What are Federations and why does research need them?
Today, research in a number of disciplines is conducted more in collaborations and less in small individual projects. The European Commission stimulates this by encouraging collaborations between research institutes across Europe. Under this encouragement, large research collaborations have been formed. Within these collaborations, access to each other's services was arranged on a case-by-case basis. It usually came down to a user obtaining the necessary credentials for the services she needed. Although this approach is cumbersome, the number of people involved was quite small, which made it acceptable.
Nowadays, the need to share resources is increasing. Individual institutes can no longer fund some of the resources needed to do research by themselves. A number of universities, for example, can share their resources to provide better services. Access to these facilities is often still based on getting a local account at the service. In addition, it involves some process of validating the identity of the person and making sure she is allowed access to the requested resources. This process is often still manual. As the number of people making use of these facilities grows, so does the burden of the validation process. Moreover, once a user has obtained an account based on her current affiliation with her research institute, she will have access to the resources until the account expires, which is typically after a year. During this year, a number of things can happen that invalidate the original request. The researcher might change her affiliation, for example. In these situations, the service provider is completely dependent on the research institute to inform it about the change. In practice this rarely happens, and the change is only detected when the account must be renewed.
Getting access to services does not need to be as difficult as described above. A user is usually affiliated with an institute and already has an identity there. This identity could be used by the service, provided the service can trust it. That implicitly means trusting the institute. We can group a number of institutes and service providers that share a particular ambition, together with the users who belong to the institutes and want to use the provided services. When these parties agree on certain common rules, we can call this a federation.
Within a federation a number of roles can be identified. On the one hand we need identities, and on the other hand services that accept these identities. This requires a trust relation between those two roles. The role of providing identities is fulfilled by an Identity Provider (IdP) and the role of providing a service by a Service Provider (SP). Finally, there is the role of the user, whose identity is verified by an IdP and consumed by an SP. Figure 1 below shows a typical overview of a federation.
Figure 1: Overview of a typical federation. Groups of federation members and partners make up the federation. Inside these groups, identity providers, service providers and users find their place.
The federation exists on the basis of trust and makes it possible to use a single identity across a range of services. This removes the need for a user to acquire a local account for every service she wishes to use. Once an identity has become invalid, the SP is protected against inappropriate access to its services.
Validating an Identity at the IdP
We have looked at what roles are present in a federation: IdPs, SPs and users. Also, we have glanced at what each role wants: use a service, obtain a validated and trusted identity, etc. This does not tell us how one can obtain an identity from an IdP and use it at an SP.
Getting a validated identity starts with a user trying to access a service. The service requires authentication but cannot perform it itself, because it does not yet know the user. Instead of rejecting the user, it presents a list of trusted IdPs from which the user must select one. The user selects the IdP that manages her identity. Next, the SP redirects the user to the selected IdP, where she is asked to provide credentials to authenticate herself. If authentication is successful, the IdP creates an assertion stating the successful authentication and hands it over to the user. The user can now present a trusted, validated identity to the SP. Before the SP accepts the provided identity and grants the user access, it verifies the assertion. The messages exchanged in this process are based on the Security Assertion Markup Language (SAML). The above SAML flow is visualized in Figure 2.
As can be seen from the above SAML flow, many steps are necessary to successfully authenticate a user. The browser helps considerably. The switching back and forth between SP and IdP is done by redirecting the browser to different URLs, and all necessary data are transferred along with these redirects. This is quite convenient: it relieves the user of a number of tasks and gives her a better experience.
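To make the roles in this flow concrete, here is a toy model in Python. It is not real SAML: actual assertions are XML documents signed with the IdP's key pair, while this sketch uses a JSON payload and an HMAC with a key shared between IdP and SP as a stand-in. All names (IDP_KEY, USERS, idp_authenticate, sp_verify) and the five-minute validity are assumptions made for illustration.

```python
import hashlib
import hmac
import json
import time

# Assumption: a shared demo key. Real SAML uses XML signatures made with
# the IdP's private key and checked with its public key.
IDP_KEY = b"demo-key-known-to-idp-and-sp"
# Hypothetical credential store at the IdP.
USERS = {"alice": "correct horse"}


def idp_authenticate(username, password, audience, now=None):
    """IdP side: check credentials and, on success, issue a signed assertion."""
    expected = USERS.get(username)
    if expected is None or not hmac.compare_digest(
        expected.encode(), password.encode()
    ):
        return None
    issued = time.time() if now is None else now
    claims = {"subject": username, "audience": audience, "not_after": issued + 300}
    payload = json.dumps(claims, sort_keys=True)
    signature = hmac.new(IDP_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def sp_verify(assertion, audience, now=None):
    """SP side: verify signature, audience and validity before granting access."""
    expected = hmac.new(
        IDP_KEY, assertion["payload"].encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, assertion["signature"]):
        return None
    claims = json.loads(assertion["payload"])
    current = time.time() if now is None else now
    if claims["audience"] != audience or claims["not_after"] < current:
        return None
    return claims["subject"]
```

In the browser-based flow, the assertion returned by idp_authenticate is what the browser carries back to the SP, and sp_verify corresponds to the check the SP performs before granting access.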
Getting Federated Identities in the Hands of non-web Applications
What has been shown in this blog so far is a successful SAML authentication for browser-based environments, i.e. using services from within a browser. However, not all access to services is done through the browser. In fact, within research communities, many scientists use command line tools to access the services they need. Examples are secure shell (SSH) and specialized tools like iRODS. Non-web applications, or command line interface (CLI) tools, cannot handle the SAML flow, at least not without modification. This is the challenge for non-web shared resources.
Bridging the web and non-web worlds
We could try to fix the command line tools. However, this is not very realistic. Instead, why not reuse as much as possible from the web world? After all, federated authentication has already been solved there. We can initiate and complete the flow from the browser. If we can create a secret or token based on the successful federated authentication, and come up with a method that allows the SP to check this secret instead of the assertion, we have something that can work. Fortunately, there is such a solution.
On Linux, applications can make use of the Pluggable Authentication Modules (PAM) for authentication. PAM is a framework in which you define which modules need to be glued together in order to successfully authenticate users. The full set of modules and their configuration is called a stack. Each module performs part of the authentication. Since a lot of tooling supports PAM authentication, it is a good candidate to solve the issue of federated authentication for non-web applications. If we are able to use the PAM framework with its modules and feed the modules with information obtained from a federated authentication, we have solved a large part of the puzzle.
PAM itself is unsuited for implementing the SAML flow. PAM exists on the level of the applications and is therefore faced with the same difficulties as the applications: it lacks a browser as well. This does not mean PAM cannot be used. It means we need the proper PAM modules to work together. As stated before, PAM modules are configured in what is called a stack. One nice aspect of PAM is that each application that supports PAM has its own stack. This gives us the ability to use a unique password per application. The next step is to tell PAM where it can find the user credentials, i.e. the username and password combination. This is handled well by an LDAP database, and PAM has modules that know how to interact with LDAP. Configuring the PAM stack of the application gives PAM access to the user credentials stored in LDAP for that application. This results in an implementation of application specific passwords (ASP).
The above, however, is not yet a substitute for federated authentication as implemented in browsers. The IdP still needs to do the authentication, and this has not happened yet. To get that part of the chain working, we need the browser. By implementing a portal that connects the web and non-web worlds, we are able to close the chain. The portal performs the federated authentication and uses the result to create an application specific password, which it stores in LDAP. The SP can then verify that generated password to authenticate the user. If the user is able to provide the password, she must have been able to authenticate herself at her IdP. The portal is, of course, a trusted component.
With the above components (PAM, LDAP, portal) in place we are able to:
- Set up a portal to obtain valid SAML assertions (steps 1 & 2 in Figure 3);
- Use the portal to update the application specific password of an authenticated user per application (step 3 in Figure 3);
- Let applications that support the PAM framework make use of the user credentials managed by the portal (steps 4 & 5 in Figure 3).
Figure 3: Overview of the high-level components for Application Specific Passwords. A user accesses a portal (1) that is able to authenticate the user at her IdP (2) and manages an ASP in an LDAP database (3). The user is then able to access the application (4) and provide the ASP to the application for authentication (5). The ASP can only be known to the user if the authentication at the IdP has been successful.
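The division of labour between portal and application can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PoC implementation: an in-memory dictionary stands in for the LDAP database, the PBKDF2 hashing scheme is our own choice, and the function names issue_asp and check_asp are hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical stand-in for the LDAP database:
# (user, application) -> (salt, password hash)
directory = {}

ITERATIONS = 100_000  # assumption: PBKDF2 work factor chosen for the sketch


def issue_asp(user, application):
    """Portal side (step 3): call only after the user authenticated at her IdP.

    Generates a random password for one application, stores only its salted
    hash, and returns the clear-text password to show to the user once.
    """
    password = secrets.token_urlsafe(16)
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    directory[(user, application)] = (salt, digest)
    return password


def check_asp(user, application, password):
    """Application side (step 5): what a PAM module backed by LDAP would check."""
    entry = directory.get((user, application))
    if entry is None:
        return False
    salt, expected = entry
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)
```

Because the entry is keyed on both user and application, a password issued for one application is rejected by every other application, which is the property that makes the password application specific.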
Other PAM modules that do authentication
In the example just discussed, we described a PAM module that asks for user credentials to authenticate a user. There are other PAM modules that perform authentication on different grounds. These can be used as well, so we are not limited to passwords and can offer a wider range of authentication methods.
One of those modules could handle one-time passwords (OTP). OTPs have a few advantages. They are easier for the user to handle, because they are a series of 6 or 8 digits. In contrast to an ASP, an OTP has a very limited lifetime and can be used only once. By implementing the necessary logic at the portal side and configuring the PAM module for OTPs, command line tools can make use of this type of authentication as well.
At the heart of the OTP is a shared secret. Knowing this secret and the current time, the current 6 or 8 digits can reliably be calculated. If both the application, by means of the PAM module, and the portal have the same secret, they will always generate the same 6 or 8 digits. Getting the secret out of the portal and into the PAM module poses a challenge at the moment, as there is no standard method of doing so.
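The calculation from secret and current time can be sketched as follows. The text above does not name a specific algorithm, so this sketch assumes the widely used time-based OTP scheme of RFC 6238 (TOTP) with HMAC-SHA1 and a 30-second time step. The function name totp and its parameters are our own.

```python
import hashlib
import hmac
import struct
import time


def totp(secret, at=None, digits=6, step=30):
    """Return the one-time password for a shared secret at a given time.

    Assumes RFC 6238 with HMAC-SHA1; `at` is a Unix timestamp and defaults
    to the current time.
    """
    now = time.time() if at is None else at
    # Both sides derive the same counter from the clock.
    counter = int(now // step)
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low 4 bits of the last byte select
    # a 4-byte window, of which the top bit is masked off.
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# RFC 6238 test vector: ASCII secret "12345678901234567890", time 59 s.
print(totp(b"12345678901234567890", at=59, digits=8))  # prints 94287082
```

The portal and the PAM module only agree on the digits if they hold the same secret and their clocks are reasonably in sync, which is why distributing the secret is the open problem mentioned above.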
For users, it means that they no longer need to memorize an ASP. Instead, they need to be able to generate 6 or 8 valid digits and provide them during the authentication phase of the application. Another advantage is that OTP generation is self-contained: the algorithm only needs the secret and the current time, so it can run on a mobile device, independent of the portal. Of course, there is also a risk here. If the device gets lost, the finder has instant access to the user's applications. To mitigate that problem, the secret needs to be protected by a passphrase that only the user knows.
Up until now we have introduced two methods of authentication: application specific passwords and one-time passwords. There is a third method: SSH keys. Instead of storing an ASP in LDAP, you store a public SSH key. By ensuring that the portal only allows authenticated users to upload their public SSH key and bind it to applications, we obtain a third method for federated authentication.
A drawback of using SSH keys in the more traditional manner is that you have no control over the lifetime of the key. It is unlimited. However, storing it in LDAP gives the portal the ability to control the lifetime of the SSH key. Along with fetching the key from the server, the lifetime can be checked. If it has expired, the key is not accepted.
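A minimal sketch of such a lifetime check is shown below. It assumes that each stored key carries an expiry timestamp set by the portal. The record layout (public_key, expires_at) and the function name valid_keys are hypothetical, and a plain list stands in for the result of an LDAP query.

```python
from datetime import datetime, timezone


def valid_keys(entries, now=None):
    """Return only the public keys whose portal-assigned lifetime has not expired.

    `entries` stands in for the key records fetched from LDAP for one user.
    """
    current = datetime.now(timezone.utc) if now is None else now
    return [e["public_key"] for e in entries if e["expires_at"] > current]


# Hypothetical records as the portal might store them.
records = [
    {"public_key": "ssh-ed25519 AAAA...old alice@laptop",
     "expires_at": datetime(2017, 1, 1, tzinfo=timezone.utc)},
    {"public_key": "ssh-ed25519 AAAA...new alice@laptop",
     "expires_at": datetime(2018, 1, 1, tzinfo=timezone.utc)},
]
# On 1 June 2017 only the second key would still be offered to the SSH server.
accepted = valid_keys(records, now=datetime(2017, 6, 1, tzinfo=timezone.utc))
```

Filtering at fetch time means an expired key never reaches the SSH server, so no change to the key itself or to the user's client is needed.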
Current Status and Future Work
The ASP and OTP methods have been set up in a Proof of Concept (PoC) environment and have been demonstrated to work with an application (iRODS). With all pieces of technology successfully put together, the iRODS application can be used via federated authentication. More information and some technical details can be found at: https://surfdrive.surf.nl/files/index.php/s/OMgPDOiQunM8szs.
The PoC version of the portal needs to be developed further. Currently it implements only the basic functionality needed to support the PoC, and users cannot yet manage the information about them in the portal. In addition, some of the components need to be redesigned. We are currently investigating whether we can leverage a tool called COmanage. It is a user management and registration application developed by Internet2, which can be easily extended with different workflows. It provides a number of necessary features that the current PoC portal has not yet implemented. In 2017 we want to develop a portal with COmanage that implements the management of ASPs, OTPs and SSH keys. This should come with proper user registration and management.
LDAP deployment scenarios
The portal functionality depends on an LDAP infrastructure. For the portal it does not really matter where the LDAP server is running. All that is needed is that access is granted. A number of deployment scenarios can be envisioned. The LDAP server can be deployed in the same domain as the portal, which means that service providers depend on an LDAP server that is outside their span of control. On the other hand, the portal could easily support an LDAP server that is within the domain of the SP. This means that SPs must permit outside access to at least part of their LDAP infrastructure. With proper Access Control Lists (ACLs) the SP can limit the impact of outside access. For a more detailed overview of the different scenarios, see: https://wiki.surfnet.nl/display/NWFAP/Non-web+Federated+Authentication+Scenario+Overview
If you are interested and/or have any questions, please contact: firstname.lastname@example.org