Commit 65b4eab9 by Alison Hodges

Merge pull request #4572 from edx/ahodges/documentation/DOC764

Ahodges/documentation/doc764
parents 692e59c3 8e8f7b98
......@@ -12,6 +12,7 @@ This document is intended for researchers and data czars at edX partner institut
internal_data_formats/change_log.rst
internal_data_formats/data_czar.rst
internal_data_formats/credentials.rst
internal_data_formats/package.rst
internal_data_formats/sql_schema.rst
internal_data_formats/discussion_data.rst
internal_data_formats/wiki_data.rst
......
......@@ -11,6 +11,12 @@ Change Log
* - Date
- Change
* - 08/01/14
- Added the :ref:`Package` chapter with information to help data czars
locate and download data package files.
* - 07/10/14
- Added the :ref:`Getting_Credentials_Data_Czar` chapter with information
to help new data czars set up credentials for secure data transfers.
* - 06/27/14
- Made a correction to the ``edx.forum.searched`` event name in the
:ref:`Tracking Logs` chapter.
......
......@@ -31,7 +31,10 @@ files before making them available to a partner institution. As a result, when
you receive a data package (or other files) from the edX Analytics team, you
must decrypt the files that it contains before you use them.
The cryptographic processes of encrypting and decrypting data files require that you create a pair of keys: the public key in the pair is used to encrypt data, and the corresponding private key is used to decrypt any files that have been encrypted with the public key.
The cryptographic processes of encrypting and decrypting data files require
that you create a pair of keys: the public key in the pair is used to encrypt
data, and the corresponding private key is used to decrypt any files that have
been encrypted with the public key.
To create the keys needed for this encryption and decryption process, you use
GNU Privacy Guard (GnuPG or GPG). Essentially, you install a cryptographic
......@@ -180,8 +183,10 @@ contains your email address, your Access Key, and your Secret Key.
.. image:: ../Images/AWS_Credentials.png
:alt: A csv file, open in Notepad, with the Access Key value and the Secret Key value underlined
.. _Access Amazon S3:
****************************************************************
Access Amazon S3 and Download Data Packages
Access Amazon S3
****************************************************************
To connect to Amazon S3, you must have your decrypted credentials. You may want
......@@ -193,29 +198,17 @@ Browser. Alternatively, you can use the `AWS Command Line Interface`_.
#. Select and install a third-party tool or interface to manage your S3
account.
#. Open your decrypted credentials.csv file. This file contains your AWS Access
Key and your AWS Secret Key.
#. Open your decrypted ``credentials.csv`` file. This file contains your AWS
Access Key and your AWS Secret Key.
#. Open the third-party tool. In most tools, you set up information about the
S3 account and then supply your Access Key and your Secret Key to connect to
that account. For more information, refer to the documentation for the tool
that you selected.
#. Access Amazon S3 and navigate to the edX **course-data** bucket. For each
period that a data package is prepared for your organization, two files are
available.
Event tracking data is in a file named {date}-{organization}-tracking.tar.
Database data files are in a file named {organization}-{date}.zip.
#. Download the files. These files can be very large, sometimes several
gigabytes in size.
#. Extract the files from the compressed .tar and the .zip files. All of the
files that you extract are .gpg files.
#. Use your private key to decrypt the .gpg files. See `Decrypt an Encrypted
File`_.
Data package files are in the edX **course-data** and
**edx-course-data** buckets. For information about the files that you
download from Amazon S3, see :ref:`Package`.
.. _AWS Command Line Interface: http://aws.amazon.com/cli/
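
Reviewer's sketch (not part of the commit): the steps removed in this hunk name two package files per period, ``{date}-{organization}-tracking.tar`` for event tracking data and ``{organization}-{date}.zip`` for database data. A minimal way to tell the two apart after download might look like the following. The function name ``classify_package_file`` and the ``YYYY-MM-DD`` date format are assumptions made for illustration; the source shows only the ``{date}`` placeholder.

```python
import re
from typing import Dict, Optional

# Assumption: dates look like YYYY-MM-DD. The documentation only shows "{date}".
_DATE = r"\d{4}-\d{2}-\d{2}"

# {date}-{organization}-tracking.tar holds event tracking data.
_TRACKING = re.compile(rf"^(?P<date>{_DATE})-(?P<organization>.+)-tracking\.tar$")
# {organization}-{date}.zip holds database data files.
_DATABASE = re.compile(rf"^(?P<organization>.+)-(?P<date>{_DATE})\.zip$")


def classify_package_file(name: str) -> Optional[Dict[str, str]]:
    """Return the kind, date, and organization for a package file name.

    Returns None when the name matches neither pattern.
    """
    for kind, pattern in (("tracking", _TRACKING), ("database", _DATABASE)):
        match = pattern.match(name)
        if match:
            return {"kind": kind, **match.groupdict()}
    return None
```

For example, ``classify_package_file("exampleX-2014-08-01.zip")`` would report a database file for the hypothetical organization ``exampleX``. Files extracted from either archive are still ``.gpg`` files and must be decrypted with the private key.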
......@@ -537,7 +537,7 @@ Columns in the student_courseenrollment Table
A row in this table represents a student's enrollment for a particular course run.
note:: A row is created for every student who starts the enrollment process, even if they never complete registration.
.. note:: A row is created for every student who starts the enrollment process, even if they never complete registration.
**History**: As of 20 Aug 2013, this table retains the records of students who unenroll. Records are no longer deleted from this table.
......
......@@ -14,6 +14,8 @@ In the data package, wiki data is delivered in two SQL files:
* The wiki_articlerevision file stores data about the articles, including data about changes and deletions. The full name of this file is in this format: edX-*organization*-*course*-wiki_articlerevision-*source*-analytics.sql.
.. _wiki_article:
***********************************
Fields in the wiki_article file
***********************************
......@@ -94,6 +96,8 @@ other_write
----------------------
Defines whether others have write access to the article. 1 if so, 0 if not.
.. _wiki_articlerevision:
******************************************************
Fields in the wiki_articlerevision file
******************************************************
......