# Configuration Management

## Introduction

**This project is currently in alpha**

The goal of the edx/configuration project is to provide a simple, but
flexible, way for anyone to stand up an instance of the edX platform
that is fully configured and ready-to-go.

Building the platform takes place in two phases:

* Infrastructure provisioning
* Service configuration

As much as possible, we have tried to keep a clean distinction between
provisioning and configuration.  You are not obliged to use both of our
tools and are free to use one without the other.  The provisioning phase
stands up the required resources and tags them with role identifiers
so that the configuration tool can come in and complete the job.

The reference platform is provisioned using an Amazon
[CloudFormation](http://aws.amazon.com/cloudformation/) template.
When the stack has been fully created you will have a new AWS Virtual
Private Cloud with hosts for the core edX services.  This template
will build quite a number of AWS resources that cost money, so please
consider this before you start.

The configuration phase is managed by [Ansible](http://ansible.cc/).
We have provided a number of playbooks that will configure each of
the edX services.

This project is a re-write of the current edX provisioning and
configuration tools.  We will be migrating features to this project
over time, so expect frequent changes.

## AWS

### Quick start - Building the stack on a single server


To deploy the entire edX platform on a single EC2 instance,
run the following commands:

```
git clone git@github.com:edx/configuration
mkvirtualenv ansible
cd configuration
pip install -r ansible-requirements.txt
cd playbooks
# adjust the settings in edx_sandbox.yml before running the playbook
ansible-playbook -vvv --user=ubuntu edx_sandbox.yml -i inventory.ini
```

This will install the following services on a single instance:

* edX lms (django/python) for courseware
* edX studio (django/python) for course authoring
* mysql (running locally)
* mongo (running locally)
* memcache (running locally)

Note: In order for mail to work properly you will need to add AWS credentials for an account that
has SES permissions; see `secure_example/vars/edxapp_sandbox.yml`.
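
If you want to sanity-check that the credentials you plan to use actually have
SES access, something like the following should work (this assumes you have the
AWS CLI installed and configured with those credentials; it is not part of the
playbooks themselves):

```
# Should print your SES sending quota; an AccessDenied error means the
# account does not have SES permissions
aws ses get-send-quota
```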

### Building the stack with CloudFormation

The first step is to provision the CloudFormation stack.  There are 
several options for doing this.

* The [AWS console](https://console.aws.amazon.com/cloudformation/home)
* The AWS [CloudFormation CLI](http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-installing-cli.html)
* Via Ansible

If you don't have experience with CloudFormation, the web console is a
good place to start: it uses a form wizard to gather configuration
parameters, gives you continuous feedback while the stack is building,
and provides useful error messages when problems occur.

Before you create the stack you will need to create a key-pair that can
be used to connect to the stack once it's instantiated.  To do this,
go to the 'EC2' section of the AWS console and create a new key-pair under
the 'Key Pairs' section.  Note the name of this key and update the
`cloudformation_templates/edx-reference-architecture.json` file: under the
'KeyName' section, change the value of 'Default' to the name of the key-pair
you just created.
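
If you prefer the command line, a key-pair can also be created with the AWS CLI
(this is only a sketch; the key name below is an example and should match whatever
you put into the template):

```
# Create a key-pair and save the private key locally
aws ec2 create-key-pair --key-name edx-sandbox \
  --query 'KeyMaterial' --output text > ~/.ssh/edx-sandbox.pem
chmod 600 ~/.ssh/edx-sandbox.pem
```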

Details on how to build the stack using Ansible are available below.

#### Building with the AWS Console
From the AWS main page that lists all the services you can use, click on the
CloudFormation link.  This will take you to a list of the cloud stacks you currently
have active.  Click the 'Create Stack' button.  In the wizard you can give your
stack a name and pass in a template which defines the edX stack.  Use the
`edx-reference-architecture.json` template in the `cloudformation_templates` directory.

#### Building with the CloudFormation CLI
To build from the CloudFormation CLI you will first have to upload the configuration
file to an S3 bucket.  The easiest way to do this is to use `s3cmd`.

```
s3cmd put /path/to/edx-reference-architecture.json s3://<bucket_name>
aws cloudformation create-stack --stack-name <stack_name> --template-url https://s3.amazonaws.com/<bucket_name>/edx-reference-architecture.json --capabilities CAPABILITY_IAM
```

### Post Bringup Manual Commands

Unfortunately, some of the infrastructure we need is not currently supported
by CloudFormation, so once your stack has been created you will need to run
a few manual commands to fill in those gaps.

This requires that you've installed the command line utilities for [ElastiCache][cachecli]
and [EC2][ec2cli].  Note that we require at least version 1.8 of the ElastiCache CLI due
to some newer commands that we rely on.

  [cachecli]: http://aws.amazon.com/developertools/2310261897259567
  [ec2cli]: http://aws.amazon.com/developertools/351

At the end of the CloudFormation run you should check the "Outputs" tab in
the Amazon UI; it will have the commands you need to run (see the
screenshot below).

![CloudFormation Output (Amazon console)](doc/cfn-output-example.png)

Run the commands shown there before moving on to the next step.
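
If you would rather read the outputs from the command line, the AWS CLI can list
them once the stack has finished building (the stack name is a placeholder, as above):

```
# List the stack outputs, which include the manual commands to run
aws cloudformation describe-stacks --stack-name <stack_name> \
  --query 'Stacks[0].Outputs' --output table
```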

### Connecting to Hosts in the Stack

Because the reference architecture makes use of an Amazon VPC, you will not be able
to address the hosts in the private subnets directly.  However, you can easily set
up a transparent "jumpbox" so that connections to all hosts in your VPC are
tunneled.

Add something like the following to your `~/.ssh/config` file.

```
Host *.us-west-1.compute.internal
  ProxyCommand ssh -W %h:%p vpc-us-west-1-jumpbox
  ForwardAgent yes

Host vpc-us-west-1-jumpbox
  HostName 54.236.202.101
  ForwardAgent yes
```

This assumes that you only have one VPC in the `us-west-1` region
that you're trying to ssh into.  Internal DNS names aren't qualified
any further than that, so to support multiple VPCs you'd have to get
creative with subnets, for example ip-10-1 and ip-10-2...

Test this by typing `ssh ip-10-0-10-1.us-west-1.compute.internal`
(of course using a hostname that exists in your environment).  If things
are configured correctly you will ssh to 10.0.10.1, jumping
transparently via your bastion host.

Getting this working is important because we'll be using Ansible
with the SSH transport and it will rely on this configuration
being in place in order to configure your servers.


### Finding your hosts via boto

Boto is how the `ec2.py` inventory script looks up metadata about your stack,
most importantly finding the names of your machines.  It needs your access
information, which should be the contents of your `~/.boto` file.  Make sure
to customize the region:

```ini
[Credentials]
aws_access_key_id = AAAAAAAAAAAAAAAAAAAA
aws_secret_access_key = BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB

[Boto]
debug = 1
ec2_region_name = us-west-1
ec2_region_endpoint = ec2.us-west-1.amazonaws.com
```
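
A quick way to confirm that boto can see your stack is to run the `ec2.py`
inventory script directly from the `playbooks` directory; assuming the standard
`--list` flag, it should print a JSON mapping of groups to hosts:

```
# Dump the dynamic inventory that Ansible will use
cd playbooks
python ec2.py --list
```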


### Tagging

Tagging is the bridge between the provisioning and configuration
phases.  The servers provisioned in your VPC will be stock Ubuntu
12.04 LTS servers.  The only difference between them will be the tags
that CloudFormation has applied to them.  These tags will be used by Ansible
to map playbooks to the correct servers.  The application of the
appropriate playbook will turn each stock host into an appropriately
configured service.

The *Group* tag is where the magic happens.  Every AWS EC2 instance
will have a *Group* tag that corresponds to a group of machines that
need to be deployed to and targeted as a group.

**Example:**
* `Group`: `edxapp_stage`
* `Group`: `edxapp_prod`
* `Group`: `edxapp_some_other_environment`
 
Additional tags can be added to AWS resources in the stack, but they should not
be made necessary for deployment or configuration.
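
For reference, the *Group* tag on an instance can be inspected or corrected by
hand with the AWS CLI (the instance ID below is a placeholder; the group value
is one of the examples above):

```
# Show the tags on an instance
aws ec2 describe-tags --filters "Name=resource-id,Values=i-0123456789abcdef0"
# Apply or fix the Group tag
aws ec2 create-tags --resources i-0123456789abcdef0 --tags Key=Group,Value=edxapp_stage
```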

## Ansible

Ansible is a configuration management tool that edX is evaluating to replace
the Puppet environment that is currently being used for edX servers.

http://ansible.cc/docs

_Note: Because the directory structure changes in v1.2 we are using the dev
version instead of the official v1.1 release._


* __Hosts__ -  The ec2.py inventory script generates an inventory file where
  hosts are assigned to groups. Individual hosts can be targeted by the "Name"
  tag or the instance ID. There should be no reason to set host-specific
  variables.
* __Groups__ - A group name is an identifier that corresponds to a group of
  roles plus an identifier for the environment.  Example: *edxapp_stage*,
  *edxapp_prod*, *xserver_stage*, etc.  For the purpose of targeting servers
  for deployment, groups are created automatically by the `ec2.py` inventory
  script since these group names will map to the _Group_ AWS tag (see the
  ad-hoc example after this list).
* __Roles__  - A role will map to a single function/service that runs on a
  server.
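
As a quick check that the group mapping works, you can target one of these groups
with an ad-hoc Ansible command through the dynamic inventory (the group name and
ssh user are examples; use whatever group names show up in the `ec2.py` output):

```
# Ping every host in the edxapp_stage group via the ec2.py inventory
cd playbooks
ansible edxapp_stage -i ec2.py -m ping -u ubuntu
```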

## Organization

### Secure vs. Insecure data

As a general policy we want to protect the following data:

* Usernames
* Public keys (keys are OK to be public, but can be used to figure out usernames)
* Hostnames
* Passwords, API keys

The following yml files and examples serve as templates that should be overridden with your own
environment-specific configuration:

* vars in `secure_example/vars` 
* files in `secure_example/files` 

Directory structure for the secure repository:

```

ansible
├── files
├── keys
└── vars

```

The same directory structure, with the required yml files and other files,
is in the `secure_example` dir:

```
secure_example/
├── files
├── keys
└── vars
```

The default `secure_dir` is set in `group_vars/all` and can be overridden by
adding another file in group_vars that corresponds to a deploy group name.


The directory structure should follow Ansible best practices.

http://ansible.cc/docs/bestpractices.html

* At the top level there are yml files for every group where a group name is an
  identifier that corresponds to a set of roles plus an environment.  
* The standard environments are _stage_ and _production_.
* Additional environments can be named as well, below an example is given
  called _custom_.


### Variables

* The ansible.cfg that is checked into the playbook directory has hash merging
  turned on; this allows us to merge secure and custom data into the default
  variable definitions for every role.
* For example, `vars/lms_vars.yml` (variables needed for the lms role) sets the
  `env_config`, which has keys that can be overridden by
  `vars/secure/edxapp_stage_vars.yml` for setting passwords and hostnames.
* If needed, additional configuration can be layered; in the example,
  `vars/secure/custom_vars.yml` changes some parameters that are set in
  `vars/secure/edxapp_stage_vars.yml`.

__TODO__: _The secure/ directories are checked into the public repo for now as an
example, these will need to be moved to a private repo or maintained outside of
github._

### Users and Groups

There are two classes of users, admins and environment users.

* The *admin_users* hash will be added to every server and will be put into a
  group that has admin bits.
* The *env_users* hash are the class of users that can be optionally included
  in one of the group-environment playbooks.


Example users are in the `vars/secure` directory:

* [*env_users* for staging environment](/vars/secure/edxapp_stage_users.yml)
* [*admin_users* will be realized on every server](/vars/secure/users.yml)

```
cloudformation_templates  <-- official edX cloudformation templates
    └── examples          <-- example templates
playbooks
 └──
     edxapp_prod.yml      <-- example production environment playbook
     edxapp_stage.yml     <-- example stage environment playbook
     edxapp_custom.yml    <-- example custom environment playbook
    ├── files             <-- edX cloudformation templates
    │   └── examples      <-- example cloudformation templates
    ├── group_vars        <-- var files that correspond to ansible group names (mapped to AWS tags)
    ├── keys              <-- public keys
    ├── roles             <-- edX services
    │   ├── common        <-- tasks that are run for all roles
    │   │   └── tasks
    │   ├── lms
    │   │   ├── tasks     <-- tasks that are run to setup an LMS
    │   │   ├── templates
    │   │   └── vars      <-- main.yml in this directory is auto-loaded when the role is included
    │   │
    │   └── nginx
    │       ├── handlers
    │       ├── tasks
    │       ├── vars
    │       └── templates
    │   (etc)
    └── vars             <-- public variable definitions
    └── secure_example   <-- secure variables (example)
```

### Installation

```
  mkvirtualenv ansible
  pip install -r ansible-requirements.txt
  util/sync_hooks.sh
```

### Launching an example CloudFormation stack - Working example

#### Provision the stack

**This assumes that you have working ssh as described above.**

  ```
  cd playbooks
  ansible-playbook  -vvv cloudformation.yml -i inventory.ini  -e 'region=<aws_region> key=<key_name> name=<stack_name> group=<group_name>'
  ```
  
* _aws_region_: example: `us-east-1`. Which AWS EC2 region to build the stack in.
* _key_name_: example: `deploy`. SSH key name configured in AWS for the region.
* _stack_name_: example: `EdxAppCustom`. Name of the stack; must not contain
  underscores or CloudFormation will complain. Must be an unused name,
  otherwise the existing stack will be updated.
* _group_name_: example: `edxapp_stage`. The group name should correspond to
  one of the yml files in `playbooks/`. Used for grouping hosts (a complete
  example invocation follows this list).
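
Putting the example values from this list together, a complete invocation might
look like the following (all values are just the examples given above):

```
cd playbooks
ansible-playbook -vvv cloudformation.yml -i inventory.ini \
  -e 'region=us-east-1 key=deploy name=EdxAppCustom group=edxapp_stage'
```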

While this is running you can watch the CloudFormation events in the AWS console as
the stack is brought up.  It loads the `playbooks/cloudformation.yml` template,
which creates a single small EBS-backed EC2 instance.

_Note: You should read the output from ansible and not necessarily trust the
'ok'; failures in cloudformation provisioning (for example, in creating the
security group), may not cause ansible-playbook to fail._

See `files/examples` for adding other components to the stack.

##### If ansible-playbook gives you import errors

Ansible really wants to call /usr/bin/python, and if you have good virtualenv
hygiene, this may lead to ansible being unable to import critical libraries
like boto. If you run into this problem, try exporting PYTHONPATH inside
your virtualenv and see if it runs better that way. E.g.:

  ```
  export PYTHONPATH=$VIRTUAL_ENV/lib/python2.7/site-packages/ 
  ansible-playbook playbooks/cloudformation.yml -i playbooks/inventory.ini
  ```

If that works fine, then you can add an export of PYTHONPATH to
`$VIRTUAL_ENV/bin/postactivate` so that you no longer have to think about it.
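
For example, appending the export to the postactivate hook described above could
be done like this:

```
# Persist the PYTHONPATH workaround in the virtualenv's postactivate hook
echo 'export PYTHONPATH=$VIRTUAL_ENV/lib/python2.7/site-packages/' >> $VIRTUAL_ENV/bin/postactivate
```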
  
### Configure the stack

* Creates admin and env users
* Creates base directories
* Creates the lms json configuration files


Assuming that the edxapp_stage.yml playbook targets hosts in your VPC
for which there are entries in your `.ssh/config`, do the
following to run your playbook.

```
  cd playbooks
  ansible-playbook -v --user=ubuntu edxapp_stage.yml -i ./ec2.py -c ssh
```

*Note: this assumes the group used for the edx stack was "edxapp_stage"*
