Content Delivery Servers

Introduction

A Content Delivery Server (CDS) is a way to offload content serving from the Pulp server to a separate machine. A CDS is configured to host one or more repositories and consumers accessing those repositories are evenly distributed across CDS instances that serve them. The Pulp server is still used to manage and synchronize content into repositories; the CDS instances synchronize repositories from the Pulp server itself and make them available to consumers.

Having CDS instances registered to a Pulp server influences the way consumer bindings function. If a CDS hosts a repository and a consumer requests to bind to that repo, the consumer is directed to access the repo on the CDS. If multiple CDS instances serve the same repository, consumers are evenly distributed across all of them.

Furthermore, consumers are configured with the remaining CDS instances and the Pulp server itself as backups. Together, these behaviors provide both load balancing and failover, giving higher availability for hosted repositories.
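The distribution and backup behavior described above can be pictured with a small sketch. It assumes a simple round-robin assignment; this guide does not state which algorithm Pulp actually uses, and host_order and its parameters are hypothetical names used only for illustration.

```python
def host_order(cds_hosts, pulp_host, consumer_index):
    """Return the ordered host list for one consumer binding.

    Assumption: round-robin rotation by consumer index. The first entry is
    the CDS the consumer is directed to; the remaining entries are backups,
    ending with the Pulp server itself.
    """
    if not cds_hosts:
        # No CDS serves the repo, so the Pulp server is the only source.
        return [pulp_host]
    start = consumer_index % len(cds_hosts)
    rotated = cds_hosts[start:] + cds_hosts[:start]
    return rotated + [pulp_host]


hosts = ["cds1.example.org", "cds2.example.org"]
for i in range(3):
    print(i, host_order(hosts, "pulp.example.com", i))
```

Each consumer ends up with a different primary CDS but the same overall set of hosts, which is what allows both even distribution and fallback.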

Each CDS instance sends a heartbeat at a regular interval over the message bus. The heartbeat is used to determine the status of the CDS with respect to connectivity and availability. The interval is specified in /etc/pulp/cds.conf and can be different for each CDS. The heartbeat statistics are included in the output of the cds list and cds info commands.
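As a rough illustration of how a "not overdue" check could work, the sketch below compares the time since the last heartbeat against the configured interval. The grace factor and the function name is_responding are assumptions made for this example, not Pulp's actual rule.

```python
from datetime import datetime, timedelta


def is_responding(last_heartbeat, interval_seconds, now, grace=2.0):
    """Return True if the heartbeat is not overdue.

    Assumption: a heartbeat counts as overdue once more than `grace`
    intervals have elapsed since the last one was received.
    """
    if last_heartbeat is None:
        # A CDS that never sent a heartbeat is treated as not responding.
        return False
    allowed = timedelta(seconds=interval_seconds * grace)
    return now - last_heartbeat <= allowed


now = datetime(2011, 4, 7, 18, 56, 0)
recent = datetime(2011, 4, 7, 18, 55, 55)
stale = datetime(2011, 2, 1, 18, 55, 55)
print(is_responding(recent, 10, now))
print(is_responding(stale, 10, now))
```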

Installation and Configuration

A server is configured to run as a CDS by first installing the Pulp CDS package and its dependencies:

$ yum install pulp-cds

If not already installed, httpd will be installed as part of the installation process. The virtual host for the CDS is also installed through the /etc/httpd/conf.d/pulp-cds.conf file.

The CDS must be able to resolve the hostname of the Pulp server in order to download content from it. This is typically done through DNS, but in development environments it often means editing /etc/hosts to add an entry for the Pulp server.

Additionally, the CDS will have to be configured to use the messaging broker on the Pulp server so it can be manipulated from the server itself. This is done through the /etc/pulp/cds.conf file:

[server]
host = pulp.example.com
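
One way to sanity-check this setting is to read it back with Python's standard configparser module. The sketch below parses a sample string rather than the real file; the helper name broker_host is hypothetical, and the file is assumed to use standard INI syntax as in the fragment above.

```python
import configparser

SAMPLE = """\
[server]
host = pulp.example.com
"""


def broker_host(text):
    """Return the [server] host value from cds.conf-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.get("server", "host")


print(broker_host(SAMPLE))
```

To check an installed CDS, the same parser could load the file directly with parser.read("/etc/pulp/cds.conf").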

Like the Pulp server itself, the CDS uses Apache to host its repositories. The typical steps to configure the Apache instance with an SSL certificate should be taken at this point.

Once the configuration changes have been made, the CDS processes are started through the init script:

$ service pulp-cds start
Starting goferd                                            [  OK  ]
Starting httpd:                                            [  OK  ]

Once the CDS server is running, it must be registered to a Pulp server. A CDS may only be registered to one Pulp server at a given time. Once registered, the CDS will only accept commands from the Pulp server that it is currently registered to. When a CDS is unregistered from a Pulp server, it is once again open to be registered by a different Pulp server.


Display and History

Display

The list of CDS instances currently registered can be viewed with the cds list command.

$ pulp-admin cds list
+------------------------------------------+
                CDS Instances
+------------------------------------------+

Name                	US East CDS
Hostname            	cds1.example.org
Description         	None                     
Cluster                 None
Repos               	repo1                     
Last Sync           	Never                    
Status:
   Responding       	Yes                      
   Last Heartbeat   	2011-04-07 18:55:55 

Name                	US West CDS       
Hostname            	cds2.example.org       
Description         	None                   
Cluster                 cluster-1  
Repos               	repo1, repo2              
Last Sync           	Never       
Status:
   Responding       	No                      
   Last Heartbeat   	2011-02-01 18:55:55              

The following fields are displayed for each CDS instance:

Name              Description
Name              Display name of the CDS, provided at registration time.
Hostname          Serves as a unique identification for the CDS.
Description       Provides extra display information about the CDS.
Cluster           If the CDS belongs to a cluster, the ID of the cluster is displayed.
Repos             List of repositories that are currently associated with the CDS. This does not necessarily mean they have all been synchronized, but rather the Pulp server has been configured with the association. At the next synchronization, the CDS will take the necessary steps to serve only the repos in this list.
Last Sync         Indicates the last time a synchronize operation was run on the CDS.
Status            Indicates the status of the CDS as determined by heartbeat.
  Responding      Indicates CDS heartbeat is not overdue [Yes|No].
  Last Heartbeat  Date & Time of last received heartbeat.
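
For scripting purposes, the label/value layout of this output can be turned into records. The following is only a sketch: it assumes labels and values are separated by a tab or by at least two spaces and that records are separated by blank lines, as in the sample above. parse_cds_records is a hypothetical helper, not part of pulp-admin.

```python
import re

SAMPLE = """\
Name                  US East CDS
Hostname              cds1.example.org
Status:
   Responding         Yes
   Last Heartbeat     2011-04-07 18:55:55

Name                  US West CDS
Hostname              cds2.example.org
Status:
   Responding         No
   Last Heartbeat     2011-02-01 18:55:55
"""


def parse_cds_records(text):
    """Split label/value output into one dict per CDS (blank-line separated)."""
    records, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            if current:
                records.append(current)
                current = {}
            continue
        # Labels and values are separated by a tab or a run of spaces;
        # lines without a value (banners, "Status:") are skipped.
        parts = re.split(r"\s{2,}|\t", line, maxsplit=1)
        if len(parts) == 2:
            current[parts[0]] = parts[1].strip()
    if current:
        records.append(current)
    return records


down = [r["Hostname"] for r in parse_cds_records(SAMPLE)
        if r.get("Responding") == "No"]
print(down)
```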

The details of a CDS instance that is currently registered can be viewed with the cds info command.

The info command takes a single, required argument:

Name        Flag          Description
Hostname    --hostname    Required. Fully qualified domain name of the CDS. The CDS uses its hostname to determine its ID on the message bus, so this argument must match the hostname the CDS resolves for itself.

$ pulp-admin cds info --hostname cds1.example.org
+------------------------------------------+
                   CDS
+------------------------------------------+

Name                	US East CDS
Hostname            	cds1.example.org
Description         	None                     
Cluster                 None
Repos               	repo1                     
Last Sync           	Never                    
Status:
   Responding       	Yes                      
   Last Heartbeat   	2011-04-07 18:55:55        

The following fields are displayed for the CDS instance:

Name              Description
Name              Display name of the CDS, provided at registration time.
Hostname          Serves as a unique identification for the CDS.
Description       Provides extra display information about the CDS.
Cluster           If the CDS belongs to a cluster, the ID of the cluster is displayed.
Repos             List of repositories that are currently associated with the CDS. This does not necessarily mean they have all been synchronized, but rather the Pulp server has been configured with the association. At the next synchronization, the CDS will take the necessary steps to serve only the repos in this list.
Last Sync         Indicates the last time a synchronize operation was run on the CDS.
Status            Indicates the status of the CDS as determined by heartbeat.
  Responding      Indicates CDS heartbeat is not overdue [Yes|No].
  Last Heartbeat  Date & Time of last received heartbeat.

History

Further information about an individual CDS can be retrieved using the cds history command, which tracks operations performed on the CDS, such as registration and repo association and unassociation.

In its simplest form, the cds history command takes only the hostname of the CDS being queried:

$ pulp-admin cds history --hostname cds1.example.org
+------------------------------------------+
                 CDS History
+------------------------------------------+

Event Type          	Repo Unassociated        
Timestamp           	2011-03-08 18:18:34      
Originator          	admin                    
Repo ID               	simple-repo              

Event Type          	Repo Associated          
Timestamp           	2011-03-08 18:09:32      
Originator          	admin                    
Repo ID               	simple-repo              

Event Type          	Registered               
Timestamp           	2011-03-08 18:08:32      
Originator          	admin        

Query arguments may be passed to the cds history command to refine the results. The following query parameters are provided:

Name        Flag          Description
Event Type  --event_type  Limits the results to only those that match the given event type. The programmatic names for the event types can be found by running the cds history --help command.
Limit       --limit       Only displays the given number of history entries. Value must be greater than zero.
Sort        --sort        Sorts the history entries according to timestamp. Valid values are "ascending" and "descending". Default is descending.
Start Date  --start_date  Limits the returned entries to on or after the given date. The format follows the Date Units guidelines.
End Date    --end_date    Limits the returned entries to on or before the given date. The format follows the Date Units guidelines.

These parameters may be combined to form advanced queries.

$ pulp-admin cds history --hostname cds1.example.org --limit 1 --event_type repo_associated
+------------------------------------------+
                 CDS History
+------------------------------------------+

Event Type          	Repo Associated          
Timestamp           	2011-03-08 18:09:32      
Originator          	admin                    
Repo ID               	simple-repo              
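
When history queries are issued from a script, it can help to validate the parameters before invoking the CLI. The sketch below builds the argument list using only the flags documented above and enforces the stated constraints on --limit and --sort. history_args is a hypothetical helper; it does not execute anything.

```python
def history_args(hostname, event_type=None, limit=None, sort=None,
                 start_date=None, end_date=None):
    """Build the argument list for `pulp-admin cds history`."""
    if limit is not None and limit <= 0:
        raise ValueError("limit must be greater than zero")
    if sort is not None and sort not in ("ascending", "descending"):
        raise ValueError("sort must be 'ascending' or 'descending'")
    args = ["pulp-admin", "cds", "history", "--hostname", hostname]
    optional = [
        ("--event_type", event_type),
        ("--limit", limit),
        ("--sort", sort),
        ("--start_date", start_date),
        ("--end_date", end_date),
    ]
    for flag, value in optional:
        # Only flags that were actually supplied are added.
        if value is not None:
            args += [flag, str(value)]
    return args


print(" ".join(history_args("cds1.example.org", limit=1,
                            event_type="repo_associated")))
```

The resulting list can be handed to subprocess.run or joined into a command line, as the print call shows.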

Registration and Unregistration

Before a CDS can be used, it must be registered with a Pulp server. Prior to registering a CDS instance, the CDS packages must be installed and started on the instance. The registration process will attempt to contact the CDS and will fail if it is unable to do so.

CDS registration is done through the pulp-admin script. Registration takes the following arguments (all arguments are optional unless otherwise specified):

Name         Flag           Description
Hostname     --hostname     Required. Fully qualified domain name of the CDS. The CDS will use its hostname when determining its ID on the message bus, so this argument must be the same value the CDS will resolve.
Name         --name         Display name for the CDS (will default to the hostname if unspecified).
Description  --description  Only used for display purposes to further identify the CDS.
Cluster      --cluster_id   If the CDS should belong to a cluster, this specifies the cluster.

Registration Only Options

Name               Flag        Description
Schedule Interval  --interval  Time between scheduled syncs, in ISO 8601 format.
Schedule Start     --start     Date and time of the first scheduled sync, in ISO 8601 format (optional; if not specified, the first sync runs immediately).
Schedule Runs      --runs      Total number of scheduled syncs (optional; if not specified, syncs run indefinitely).
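
For readers unfamiliar with ISO 8601, the sketch below shows how a duration and a date-time look in that notation. It is generic ISO 8601 formatting, not Pulp code; the exact subset of the standard that --interval and --start accept is not stated here and should be checked against the Date Units guidelines. iso8601_duration is a hypothetical helper.

```python
from datetime import datetime, timedelta


def iso8601_duration(delta):
    """Format a non-negative timedelta as an ISO 8601 duration, e.g. PT6H."""
    total = int(delta.total_seconds())
    if total < 0:
        raise ValueError("duration must not be negative")
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    date_part = f"{days}D" if days else ""
    time_part = "".join(
        f"{value}{unit}"
        for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )
    if not date_part and not time_part:
        return "PT0S"
    return "P" + date_part + ("T" + time_part if time_part else "")


print(iso8601_duration(timedelta(hours=6)))
print(datetime(2011, 4, 7, 18, 0, 0).isoformat())
```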

Example

$ pulp-admin cds register --hostname cds1.example.org --name "US East CDS"
Successfully registered CDS [cds1.example.org]

Once a CDS is registered, repositories can be associated with it and synchronized to the instance.


Repo Association

Pulp repositories are assigned to CDS instances through a process called association. Once associated, a repository's contents will be sent to the CDS on the next sync.

Association

Association is done through the cds associate_repo command. The two arguments, the CDS hostname and repo ID, are both required.

$ pulp-admin cds associate_repo --hostname cds1.example.org --repoid repo1
Successfully associated CDS [cds1.example.org] with repo [repo1]

Note that the repository contents are not immediately sent to the CDS. This allows multiple repos to be configured for a CDS before initiating the synchronization. The sync must be manually run with cds sync.
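
The associate-then-sync workflow lends itself to a small wrapper script. The sketch below uses only the associate_repo and sync commands shown in this guide; associate_and_sync and its run parameter are hypothetical, and the example call is a dry run that prints the commands instead of executing them.

```python
import subprocess


def associate_and_sync(hostname, repo_ids, run=subprocess.check_call):
    """Associate each repo with the CDS, then trigger a single sync."""
    for repo_id in repo_ids:
        run(["pulp-admin", "cds", "associate_repo",
             "--hostname", hostname, "--repoid", repo_id])
    # One sync after all associations avoids syncing once per repo.
    run(["pulp-admin", "cds", "sync", "--hostname", hostname])


# Dry run: print the commands rather than invoking pulp-admin.
associate_and_sync("cds1.example.org", ["repo1", "repo2"],
                   run=lambda cmd: print(" ".join(cmd)))
```

Leaving run at its default would execute the commands and stop at the first one that exits with a non-zero status.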

Unassociation

Similarly, a previously associated repository can be removed from a CDS through the cds unassociate_repo command. This command takes the same two required arguments.

$ pulp-admin cds unassociate_repo --hostname cds1.example.org --repoid repo1
Successfully unassociated repo [repo1] from CDS [cds1.example.org]

Synchronization

Synchronization is the process by which the CDS downloads content for all associated repositories. Any previously synchronized repos that have been unassociated from the CDS are removed during this process. Any updated content to existing repositories is downloaded to the CDS.
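
The effect of a sync on the set of repos held by a CDS can be sketched with set arithmetic. This is a conceptual model under the assumption that the CDS compares its associated repos against what it already holds; plan_sync is a hypothetical name and not Pulp's implementation.

```python
def plan_sync(associated, on_disk):
    """Return (to_download, to_remove) as sorted lists of repo IDs.

    Assumption: the CDS compares the repos currently associated with it
    against the repos it already holds from earlier syncs.
    """
    associated, on_disk = set(associated), set(on_disk)
    # Repos that were unassociated since the last sync are removed.
    to_remove = sorted(on_disk - associated)
    # Every associated repo is revisited so updated content is picked up.
    to_download = sorted(associated)
    return to_download, to_remove


print(plan_sync(associated=["repo1", "repo3"], on_disk=["repo1", "repo2"]))
```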

The cds sync command triggers an immediate synchronization of the CDS identified by --hostname, its only argument.

$ pulp-admin cds sync --hostname cds1.example.org
Sync for CDS [cds1.example.org] started
Use "cds status" to check on the progress

The status of the current sync, along with previous syncs, can be retrieved with cds status. Additionally, the --recent flag takes a number N and shows the results of the last N sync operations.

$ pulp-admin cds status --hostname cds1.example.org --recent 1

+------------------------------------------+
                 CDS Status
+------------------------------------------+

Name                	cds1.example.org    
Hostname            	cds1.example.org    
Description         	None                     
Repos               	pulp-f13                 
Last Sync           	2011-03-10 17:25:55      

+------------------------------------------+
           Most Recent Sync Tasks
+------------------------------------------+

State               	running                  
Start Time          	2011-03-10 17:31:51      
Finish Time         	In Progress              

Clusters

A CDS cluster is a mechanism to enforce consistency across CDS instances. The Pulp server keeps the repository associations of all CDS instances in the same cluster identical.

A CDS can only belong to one cluster at a time. Clusters do not need to be explicitly created or deleted; clusters are determined by inspecting metadata on CDS instances themselves. A cluster ID must adhere to the standard Pulp guidelines for IDs. Cluster membership can be set at registration time or on an existing CDS through the cds update command.
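
Because clusters exist only as metadata on the CDS instances, the set of clusters can be derived by grouping instances on their cluster ID. The sketch below illustrates that idea; the dictionary keys hostname and cluster_id are assumptions chosen for the example, not Pulp's data model.

```python
from collections import defaultdict


def clusters_from_metadata(cds_instances):
    """Group CDS hostnames by cluster ID; unclustered instances are skipped."""
    clusters = defaultdict(list)
    for cds in cds_instances:
        cluster_id = cds.get("cluster_id")
        if cluster_id:
            clusters[cluster_id].append(cds["hostname"])
    return dict(clusters)


instances = [
    {"hostname": "cds1.example.org", "cluster_id": None},
    {"hostname": "cds2.example.org", "cluster_id": "cluster-1"},
    {"hostname": "cds3.example.org", "cluster_id": "cluster-1"},
]
print(clusters_from_metadata(instances))
```

In this model a cluster disappears as soon as its last member leaves, which matches the statement that clusters never need to be created or deleted explicitly.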

Example: Cluster membership at registration

$ pulp-admin cds register --hostname cds-1.example.org --cluster_id cluster-1
Successfully registered CDS [cds-1.example.org]

Example: Adding an existing CDS to a cluster

$ pulp-admin cds update --hostname cds-1.example.org --cluster_id cluster-1
Successfully updated CDS [cds-1.example.org]

Example: Removing a CDS from a cluster

$ pulp-admin cds update --hostname cds-1.example.org --remove_cluster
Successfully updated CDS [cds-1.example.org]