[alto] Protocol between ALTO server and information sources


Gao Kai
 

Hi Richard,

Thanks for your comments.

The reason I think a standard might be helpful is exactly that an implementation can then get information from different sources without worrying about adaptation issues.  I think such a standard, which decouples the development of ALTO servers from that of information sources, would be beneficial in the long run.


But private protocols can still exist for particular implementations.  In fact, they can easily extend the protocol by replacing 'type' and 'source' with private values and filling the 'data' field in their own format, as in the sketch below.
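
For example, a hypothetical private extension might look like the following; the type name, source identity and payload are made up purely for illustration:

    {
        /* a private type value outside the standard alto-IS-* set */
        "type": "x-example-flow-stats",

        /* a private source identity */
        "source": "example-controller:01",

        /* the payload is opaque to anyone who does not know the type */
        "data": [
            {
                "switch": "openflow:1",
                "flow-count": 4096
            }
        ]
    }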

Regards,
Kai

On 03/07/15 09:55, Y. Richard Yang wrote:
Dear Kai,

Thanks for the proposal. Here is some initial feedback.

Please see below.


On Thu, Jul 2, 2015 at 12:26 AM, Gao Kai <gaok12@...> wrote:
Hi all,

I have some thoughts on the implementation of ALTO servers and I'd like to get some feedback on the idea.

The basic functionalities of an ALTO server are:

- Collecting data from the information sources;
- Publishing the information to clients (using ALTO protocol).

While the latter is well-defined in RFC 7285, there are no standards for the communication between an ALTO server and information sources.  There are three scenarios:

- The ALTO server is deeply embedded in the information source, just like what we are trying to do in OpenDaylight.

- The ALTO server is partially embedded in the information source.  For example, in the early stage of our implementation in ODL, we used an external server which pulls data from ODL using RESTCONF and converts it into RFC 7285-compatible formats (see the sketch after this list).

- The ALTO server is decoupled from the information source.

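To make "RFC 7285-compatible formats" concrete: the output of the external server in the second scenario is just standard ALTO resources, e.g. a cost map as defined in RFC 7285 (the PID names, version tag and cost values below are only placeholders):

    {
        "meta": {
            "dependent-vtags": [
                { "resource-id": "default-network-map", "tag": "75ed013b3cb58f896e839582504f622838ce670f" }
            ],
            "cost-type": { "cost-mode": "numerical", "cost-metric": "routingcost" }
        },
        "cost-map": {
            "PID1": { "PID1": 1, "PID2": 5, "PID3": 10 },
            "PID2": { "PID1": 5, "PID2": 1, "PID3": 15 }
        }
    }
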
It is for the last scenario that a standard protocol might be helpful. 

Provisioning of ALTO information is not defined in the ALTO protocol. In a purely private setting, the protocol and tools for an ALTO server to access/control each information source and pull related info from it (or for the source to push related info to the ALTO server) can be entirely private, and hence there is no need to define a standard.

I agree that in a more general setting, an ALTO server may use multiple information sources, and the sources may use different tools and belong to different domains. 

Here is a problem that I am thinking about how to solve: how to implement the endpoint cost service? This depends on the setting, as you identified. For example, consider two scenarios:

- All endpoints and the network connecting them belong to the same SDN controller. The ALTO server can use the state and control functions available in the SDN controller to obtain the info.

- It is not an SDN setting. Rather, assume a Science Network setting.  A tool that comes to mind is perfSONAR (https://fasterdata.es.net/performance-testing/perfsonar/), which is an infrastructure used in Science Networks to measure loss and bandwidth. The ALTO server may use the data from perfSONAR (or even control on-demand measurements) to get the results (it may need to use the perfSONAR hosts as landmarks...)
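
For instance (reusing the envelope from your proposal below; the host name and metric values are just placeholders), a perfSONAR-derived sample between two endpoints might be carried as:

    {
        "type": "alto-IS-e2e",
        "source": "perfsonar:ps-host-1.example.net",
        "data": [
            {
                "meta": {
                    "src": "ipv4:192.0.2.10",
                    "dst": "ipv4:198.51.100.20"
                },
                "statistics": {
                    "average-round-trip-time": "12ms",
                    "average-drop-rate-percentage": "0.1"
                }
            }
        ]
    }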
 
I believe that the preceding will be the main complexity and value of providing ALTO. My understanding is that you want to introduce a standard here, so that an ALTO server can use a uniform interface to multiple information sources (i.e., the ALTO server does not need a lot of adaptation code: one piece for querying ODL, one for querying ONOS, one for querying perfSONAR, ...)?

Do I understand your intention?

Richard


To get started, I can think of two basic and probably most common implementation choices, and both can use multiple different information sources:

- End-to-End:
  The server builds its maps using a full-mesh internal representation.  This can happen if the server uses end-to-end measurement methods or simply does not have access to topology information.

- Topology-based:
  In this case the server uses a graph representation, which can contain internal nodes besides the endpoints.  A server can fetch the topology view directly, either from configuration or by querying an SDN controller.  One can also use aggregated data such as an inferred AS graph.

Accordingly we can identify the following kinds of information for both implementation choices:

- Connection-based statistics;
- Link-based statistics;
- (?) Node-based statistics.

All three kinds of statistics can be encapsulated using the following JSON representation (the comments mark which fields apply to which type):

    {
        /* the type of the statistics */
        "type": "alto-IS-e2e",
        /* choices: alto-IS-e2e, alto-IS-link, alto-IS-node, etc. */

        /* identity of the provider of the information */
        "source": "grid-ftp-client:ef423dab",

        "data": [
            {
                "meta": {
                    /* e2e */
                    "src": "ipv4:X.X.X.X",
                    "dst": "ipv4:Y.Y.Y.Y",

                    /* link */
                    "id": "e15",

                    /* node */
                    "id": "n02"
                },
                "statistics": {
                    /* e2e */
                    "average-round-trip-time": "10ms",
                    "average-drop-rate-percentage": "5",
                    ...,

                    /* link */
                    "status": "up",
                    "capacity": "10Gbps",
                    "available-bandwidth": "5Gbps",
                    ...,

                    /* node */
                    ...
                    /* actually I don't have any statistics for nodes in mind */
                }
            },
            ...
        ]
    }

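As a concrete instance (the source identity and values are placeholders), a message carrying only link statistics could look like this:

    {
        "type": "alto-IS-link",
        "source": "odl-controller:bf11ac02",
        "data": [
            {
                "meta": { "id": "e15" },
                "statistics": {
                    "status": "up",
                    "capacity": "10Gbps",
                    "available-bandwidth": "5Gbps"
                }
            },
            {
                "meta": { "id": "e16" },
                "statistics": {
                    "status": "down",
                    "capacity": "10Gbps",
                    "available-bandwidth": "0Gbps"
                }
            }
        ]
    }
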
There are also considerations on push/pull modes, integration with the IRD, and potential DDoS threats, but first I'd like to hear some feedback on the proposal from you ALTO folks.

Thanks!

Regards,
Kai