Commit 8e8f7b98 by Alison Hodges

Getting data package files from AWS

parent 61f9fe56
...@@ -12,6 +12,7 @@ This document is intended for researchers and data czars at edX partner institut
internal_data_formats/change_log.rst internal_data_formats/change_log.rst
internal_data_formats/data_czar.rst internal_data_formats/data_czar.rst
internal_data_formats/credentials.rst internal_data_formats/credentials.rst
internal_data_formats/package.rst
internal_data_formats/sql_schema.rst internal_data_formats/sql_schema.rst
internal_data_formats/discussion_data.rst internal_data_formats/discussion_data.rst
internal_data_formats/wiki_data.rst internal_data_formats/wiki_data.rst
......
...@@ -11,6 +11,12 @@ Change Log
* - Date * - Date
- Change - Change
* - 08/01/14
- Added the :ref:`Package` chapter with information to help data czars
locate and download data package files.
* - 07/10/14
- Added the :ref:`Getting_Credentials_Data_Czar` chapter with information
to help new data czars set up credentials for secure data transfers.
* - 06/27/14 * - 06/27/14
- Made a correction to the ``edx.forum.searched`` event name in the - Made a correction to the ``edx.forum.searched`` event name in the
:ref:`Tracking Logs` chapter. :ref:`Tracking Logs` chapter.
......
...@@ -31,7 +31,10 @@ files before making them available to a partner institution. As a result, when
you receive a data package (or other files) from the edX Analytics team, you you receive a data package (or other files) from the edX Analytics team, you
must decrypt the files that it contains before you use them. must decrypt the files that it contains before you use them.
The cryptographic processes of encrypting and decrypting data files require that you create a pair of keys: the public key in the pair is used to encrypt data, and the corresponding private key is used to decrypt any files that have been encrypted with the public key. The cryptographic processes of encrypting and decrypting data files require
that you create a pair of keys: the public key in the pair is used to encrypt
data, and the corresponding private key is used to decrypt any files that have
been encrypted with the public key.
To create the keys needed for this encryption and decryption process, you use To create the keys needed for this encryption and decryption process, you use
GNU Privacy Guard (GnuPG or GPG). Essentially, you install a cryptographic GNU Privacy Guard (GnuPG or GPG). Essentially, you install a cryptographic
...@@ -180,8 +183,10 @@ contains your email address, your Access Key, and your Secret Key.
.. image:: ../Images/AWS_Credentials.png .. image:: ../Images/AWS_Credentials.png
:alt: A csv file, open in Notepad, with the Access Key value and the Secret Key value underlined :alt: A csv file, open in Notepad, with the Access Key value and the Secret Key value underlined
.. _Access Amazon S3:
**************************************************************** ****************************************************************
Access Amazon S3 and Download Data Packages Access Amazon S3
**************************************************************** ****************************************************************
To connect to Amazon S3, you must have your decrypted credentials. You may want To connect to Amazon S3, you must have your decrypted credentials. You may want
...@@ -193,29 +198,17 @@ Browser. Alternatively, you can use the `AWS Command Line Interface`_.
#. Select and install a third-party tool or interface to manage your S3 #. Select and install a third-party tool or interface to manage your S3
account. account.
#. Open your decrypted credentials.csv file. This file contains your AWS Access #. Open your decrypted ``credentials.csv`` file. This file contains your AWS
Key and your AWS Secret Key. Access Key and your AWS Secret Key.
#. Open the third-party tool. In most tools, you set up information about the #. Open the third-party tool. In most tools, you set up information about the
S3 account and then supply your Access Key and your Secret Key to connect to S3 account and then supply your Access Key and your Secret Key to connect to
that account. For more information, refer to the documentation for the tool that account. For more information, refer to the documentation for the tool
that you selected. that you selected.
#. Access Amazon S3 and navigate to the edX **course-data** bucket. For each Data package files are in the edX **course-data** and
period that a data package is prepared for your organization, two files are **edx-course-data** buckets. For information about the files that you
available. download from Amazon S3, see :ref:`Package`.
Event tracking data is in a file named {date}-{organization}-tracking.tar.
Database data files are in a file named {organization}-{date}.zip.
#. Download the files. These files can be very large, sometimes several
gigabytes in size.
#. Extract the files from the compressed .tar and the .zip files. All of the
files that you extract are .gpg files.
#. Use your private key to decrypt the .gpg files. See `Decrypt an Encrypted
File`_.
.. _AWS Command Line Interface: http://aws.amazon.com/cli/ .. _AWS Command Line Interface: http://aws.amazon.com/cli/
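The steps in this hunk involve two small mechanical tasks: reading the Access Key and Secret Key out of the decrypted ``credentials.csv`` file, and recognizing the two file names that the removed text gives for each period ({date}-{organization}-tracking.tar and {organization}-{date}.zip). A minimal Python sketch of both follows. The CSV column names, the sample organization, and the date string are assumptions for illustration only; the text states neither the header names nor the date format.

```python
import csv
import io

# Assumed column names; check the header row of your own credentials.csv.
ACCESS_KEY_COLUMN = "Access Key Id"
SECRET_KEY_COLUMN = "Secret Access Key"


def read_credentials(csv_text):
    """Return (access_key, secret_key) from the first data row of the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    row = next(reader)
    return row[ACCESS_KEY_COLUMN].strip(), row[SECRET_KEY_COLUMN].strip()


def package_file_names(organization, date):
    """Build the two file names that the removed steps describe.

    The format of ``date`` is not stated in the text, so the value is
    passed through unchanged.
    """
    tracking = f"{date}-{organization}-tracking.tar"  # event tracking data
    database = f"{organization}-{date}.zip"           # database data files
    return tracking, database
```

For example, ``package_file_names("SampleX", "2014-08-01")`` yields the pair of names to look for in the bucket for a hypothetical organization and date. The keys returned by ``read_credentials`` are the values that you supply to whichever S3 tool you selected.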
...@@ -537,7 +537,7 @@ Columns in the student_courseenrollment Table
A row in this table represents a student's enrollment for a particular course run. A row in this table represents a student's enrollment for a particular course run.
note:: A row is created for every student who starts the enrollment process, even if they never complete registration. .. note:: A row is created for every student who starts the enrollment process, even if they never complete registration.
**History**: As of 20 Aug 2013, this table retains the records of students who unenroll. Records are no longer deleted from this table. **History**: As of 20 Aug 2013, this table retains the records of students who unenroll. Records are no longer deleted from this table.
......
...@@ -14,6 +14,8 @@ In the data package, wiki data is delivered in two SQL files:
* The wiki_articlerevision file stores data about the articles, including data about changes and deletions. The full name of this file is in this format: edX-*organization*-*course*-wiki_articlerevision-*source*-analytics.sql. * The wiki_articlerevision file stores data about the articles, including data about changes and deletions. The full name of this file is in this format: edX-*organization*-*course*-wiki_articlerevision-*source*-analytics.sql.
.. _wiki_article:
*********************************** ***********************************
Fields in the wiki_article file Fields in the wiki_article file
*********************************** ***********************************
...@@ -94,6 +96,8 @@ other_write
---------------------- ----------------------
Defines whether others have write access to the article. 1 if so, 0 if not. Defines whether others have write access to the article. 1 if so, 0 if not.
.. _wiki_articlerevision:
****************************************************** ******************************************************
Fields in the wiki_articlerevision file Fields in the wiki_articlerevision file
****************************************************** ******************************************************
......