Commit ae5c897e by Alison Hodges

Completed edits for Sylvia's review. Added data czar to glossary

parent d2deecc4
......@@ -143,6 +143,18 @@ C
D
****
.. _Data Czar_g:
**Data Czar**
A data czar is the single representative at a partner institution who is
responsible for receiving course data from edX, and transferring it securely
to researchers and other interested parties after it is received.
See `edX Research Guide`_.
.. _edX Research Guide: http://edx.readthedocs.org/projects/devdata/en/latest/
.. _Discussion Forum:
**Discussion Forum**
......
......@@ -8,10 +8,10 @@ EdX transfers course data to the data czars at our partner institutions in
regularly generated data packages. Data packages can only be accessed by a
single contact at each university, referred to as the "data czar".
The data czar who is selected at each institution sets up encryption "keys"
for securely transferring files from edX to the partner institution. Meanwhile,
the Analytics team at edX sets up credentials so that the data czar can log in
to the site where data packages are stored.
The data czar who is selected at each institution sets up keys for securely
transferring files from edX to the partner institution. Meanwhile, the
Analytics team at edX sets up credentials so that the data czar can log in to
the site where data packages are stored.
.. image:: ../Images/Data_Czar_Initialization.png
:alt: Flowchart of data czar creating public and private keys and sending the
......@@ -20,31 +20,31 @@ to the site where data packages are stored.
the data czar
After these steps for setting up credentials are complete, the data czar can
download data packages.
download data packages on an ongoing basis.
****************************************************************
Data Czar: Create Keys for Encryption and Decryption
****************************************************************
To assure the security of data packages, the edX Analytics team encrypts all
files before transferring them to a partner institution. As a result, when you
receive a data package (or any other file from the edX Analytics team), you must
decrypt the data before it can be used in any way.
files before making them available to a partner institution. As a result, when
you receive a data package (or other files) from the edX Analytics team, you
must decrypt the files that it contains before you use them.
The cryptographic processes of encrypting and decrypting data files require that you create a pair of keys: the public key in the pair is used to encrypt data, and the corresponding private key is used to decrypt any files that have been encrypted with the public key.
To create the keys needed for this encryption and decryption process, you use
GNU Privacy Guard (GnuPG or GPG). Essentially, you install a cryptographic
application on your local computer and supply your email address and a secret
passphrase (a password). The application uses this information to create both a
private key for you to use for *decrypting* files from edX and also the unique
public key that you send to edX to use in *encrypting* your data packages and
files. Each data czar creates his or her own private and public key pair to use
with edX files.
passphrase (a password).
.. note:: The email address that you supply when you create your keys must be your official email address at your edX partner institution.
Creating these keys is a one-time process that you coordinate with your edX
program manager. Instructions for creating the keys on Windows or Macintosh
follow.
The result is the public key that you send to edX to use in encrypting data
files for your institution, and the private key which you keep secret and use
to decrypt the encrypted files that you receive. Creating these keys is a
one-time process that you coordinate with your edX program manager. Instructions
for creating the keys on Windows or Macintosh follow.
For more information about GPG encryption and creating key pairs, see the
`Gpg4win Compendium`_.
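
For readers who prefer a command line to the graphical tools described in this section, the following is a minimal sketch of the public key export step using the ``gpg`` program. It is not part of the edX procedure. It assumes that ``gpg`` is installed and that a key pair already exists (key creation is interactive, for example with ``gpg --gen-key``). The email address and file name are hypothetical placeholders.

```python
from typing import List


def export_public_key_command(email: str, output_path: str) -> List[str]:
    """Build a gpg command that exports only the public key in ASCII format.

    --armor writes ASCII output, and --export never includes the private key.
    """
    return ["gpg", "--armor", "--output", output_path, "--export", email]


if __name__ == "__main__":
    # Hypothetical values: substitute your official institutional address.
    command = export_public_key_command(
        "data.czar@university.example", "public_key.asc"
    )
    # Print the command for review; run it yourself in a terminal.
    print(" ".join(command))
```

The resulting .asc file corresponds to the file that you attach to the message for your edX program manager.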
......@@ -94,7 +94,7 @@ Create Keys: Macintosh
#. When the download is complete, click the .dmg file to begin the
installation.
#. When installation is complete, GPG Keychain Access opens a web page with
When installation is complete, GPG Keychain Access opens a web page with
`First Steps`_ and a dialog box.
#. Enter your name and email address. Be sure to enter your official university
......@@ -111,35 +111,39 @@ Create Keys: Macintosh
a. Specify a file name and location to save the file.
b. Make sure that **Format** is ASCII.
b. Make sure that **Format** is set to ASCII.
c. Make sure that **Allow secret key export** is cleared.
When you click **Save**, only the public key is saved in the resulting .asc
file. Do not share your private key with edX or any third party.
#. Compose an e-mail message to your edX program manager. Attach the .asc
file that you saved in the previous step to the message then send the
file that you saved in the previous step to the message, then send the
message.
.. _GPG Tools: https://gpgtools.org/
.. _First Steps: http://support.gpgtools.org/kb/how-to/first-steps-where-do-i-start-where-do-i-begin#setupkey
****************************************************************
edX: Create and Deliver Credentials for Accessing Data Storage
EdX: Deliver Credentials for Accessing Data Storage
****************************************************************
The data packages that edX prepares for each partner organization are uploaded
to the Amazon Web Services (AWS) Simple Storage Service (S3). The edX Analytics
team creates an individual account to access this storage service for each data
czar. The credentials for accessing this account are called an Access Key
and a Secret Key.
to the Amazon Web Services (AWS) Simple Storage Service (Amazon S3). The edX
Analytics team creates an individual account to access this storage service for
each data czar. The credentials for accessing this account are called an Access
Key and a Secret Key.
After the edX Analytics team creates these access credentials for you, they are
encrypted (using the public encryption key that you sent your program manager)
into a **credentials.csv.gpg** file. This file is then sent to you as an email
attachment.
After the edX Analytics team creates these access credentials for you, they use
the public encryption key that you sent your program manager to encrypt the
credentials into a **credentials.csv.gpg** file. The edX Analytics team then
sends the file to you as an email attachment.
The **credentials.csv.gpg** file is likely to be the first file that you
decrypt with your private GPG key. You use the same process to decrypt the data
package files that you retrieve from Amazon S3.
package files that you retrieve from Amazon S3. See `Decrypt an Encrypted
File`_.
.. image:: ../Images/Access_AmazonS3.png
:alt: Flowchart of edX collecting files for the data package and then
......@@ -149,9 +153,9 @@ package files that you retrieve from Amazon S3.
.. _Decrypt an Encrypted File:
==========================================
****************************************************************
Decrypt an Encrypted File
==========================================
****************************************************************
To work with an encrypted .gpg file, you use the same GNU Privacy Guard program
that you used to create your public/private key pair. You use your private key
......@@ -161,24 +165,24 @@ to decrypt the Amazon S3 credentials file and the files in your data packages.
#. On a Windows computer, open Windows Explorer. On a Macintosh, open Finder.
#. Navigate to the file and right-click on it.
#. Navigate to the file and right-click it.
#. On a Windows computer, select **Decrypt and verify** and then click
**Decrypt/Verify**. On a Macintosh, select **Services** and then click
#. On a Windows computer, select **Decrypt and verify**, then click
**Decrypt/Verify**. On a Macintosh, select **Services**, then click
**OpenPGP: Decrypt File**.
#. Enter your passphrase. The GNU Privacy Guard program decrypts the file.
For example, when you decrypt the credentials.csv.gpg file, the result is a
credentials.csv file. When you open the credentials.csv file it contains your
email address, your Access Key, and your Secret Key.
credentials.csv file. Open the decrypted credentials.csv file to see that it
contains your email address, your Access Key, and your Secret Key.
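
The decryption steps above can also be approximated from a command line. The following is a minimal sketch, assuming the ``gpg`` program is installed and your private key is in its keyring. The helper names are hypothetical and are not part of the edX instructions. When the command runs, ``gpg`` prompts for your passphrase.

```python
from pathlib import Path
from typing import List


def decrypted_name(encrypted: str) -> str:
    """Return the output file name: the input name without its .gpg suffix."""
    path = Path(encrypted)
    if path.suffix != ".gpg":
        raise ValueError("expected a file name ending in .gpg: " + encrypted)
    return str(path.with_suffix(""))


def decrypt_command(encrypted: str) -> List[str]:
    """Build a gpg command that decrypts a file next to the encrypted copy."""
    return ["gpg", "--output", decrypted_name(encrypted), "--decrypt", encrypted]


if __name__ == "__main__":
    # Print the command for review; run it yourself in a terminal.
    print(" ".join(decrypt_command("credentials.csv.gpg")))
```

The same command shape applies to the encrypted files in a data package.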
.. image:: ../Images/AWS_Credentials.png
:alt: A csv file, open in Notepad, with the access key value and the secret key value underlined
:alt: A csv file, open in Notepad, with the Access Key value and the Secret Key value underlined
============================================
****************************************************************
Access Amazon S3 and Download Data Packages
============================================
****************************************************************
To connect to Amazon S3, you must have your decrypted credentials. You may want
to have a third-party tool that gives you a user interface for managing files
......@@ -204,7 +208,7 @@ Browser. Alternatively, you can use the `AWS Command Line Interface`_.
Event tracking data is in a file named {date}-{organization}-tracking.tar.
Database data files are in a file named {organization}-{date}.zip.
#. Download the files. These files can become very large, sometimes several
#. Download the files. These files can be very large, sometimes several
gigabytes in size.
#. Extract the files from the compressed .tar and the .zip files. All of the
......
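
The file naming patterns given in the download steps can be used to sort downloaded files by type. The following is a minimal sketch that relies only on the ``-tracking.tar`` and ``.zip`` endings stated above. The function name and the sample file names are hypothetical, and the exact date format in real file names is not assumed.

```python
def classify_package_file(name: str) -> str:
    """Classify a data package file by the naming patterns in this section."""
    if name.endswith("-tracking.tar"):
        # Pattern: {date}-{organization}-tracking.tar
        return "event tracking"
    if name.endswith(".zip"):
        # Pattern: {organization}-{date}.zip
        return "database"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical file names for illustration only.
    for sample in ["2014-05-01-ExampleX-tracking.tar", "ExampleX-2014-05-01.zip"]:
        print(sample, "->", classify_package_file(sample))
```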