Frequently Asked Questions

General

  • What is DocumentCloud? DocumentCloud is a web-based software platform for organizing, researching, annotating, analyzing, and publishing primary source documents. We offer a set of tools that help you find and tell stories in your documents.

  • How much does it cost? Thanks to generous funding from the Knight Foundation, DocumentCloud has been offered for free exclusively to verified journalism organizations since its launch in 2010. We are currently developing a paid model to ensure the DocumentCloud platform’s sustainability.

  • Who can have accounts? Anyone may have a DocumentCloud/MuckRock account to view documents, but if you are seeking to upload documents, you need to request account verification.

  • Where can I get help? You can find answers to most questions in this FAQ, on one of our other help pages or on our YouTube channel. If you can’t find the answer or if you’re having trouble using DocumentCloud, please email us at info@documentcloud.org.

  • Where can I read your Terms of Service? The current Terms of Service for DocumentCloud and MuckRock are viewable on MuckRock's TOS page.

Privacy

  • What is DocumentCloud’s privacy policy? Please read our complete Privacy Policy for details.

  • Who can see documents in my account? By default, any document you upload is set to “private” access and is viewable only by you. If you set the access level to “private to [your organization name],” then other DocumentCloud users in your organization can view the document, but not the public. If you set it to "public", then all users whether registered with a DocumentCloud account or not may view your documents.

  • Is metadata about uploaded files preserved? If you are uploading a PDF, then yes, by default the document's metadata, such as authorship and creation date, is preserved. However, if you perform OCR on the document by selecting "Force OCR", or you redact the document, the document's underlying metadata will be wiped. If this is a concern to you, you may want to keep an original copy of the document(s). You can do so using the PDF Exporter Add-On or by downloading documents individually. Files that are not PDFs are converted into PDFs using LibreOffice and therefore do not preserve metadata by default.

Accounts

  • Do I have to use my real name for my account? Yes, the DocumentCloud Terms of Service require that accounts use real names and valid email addresses. However, we allow organizations to create one shared account for posting documents and another shared account for use with automation technology or our API. These accounts should have an appropriate name, such as “[Organization name] Documents.”

  • How do I get an account? Sign up for an account at the plan selection page. To begin uploading documents to DocumentCloud, however, you must request account verification.

  • Can I have more than one account with the same email? No, you cannot have more than one account with the same email. You can have one account with multiple emails, and one account can belong to multiple organizations, although one organization will be the default. We recommend that each individual have a single account with all of their emails and organizations associated with it.

  • How do I log in and log out? To log in, on the DocumentCloud home page, enter your account email address and password in the login box. To log out, click the “Log Out” link at the top right of the workspace. Note: Always log out when you’re done working if you’re logged in on a shared computer.

  • How do I reset my password? If you’ve forgotten your password, go to our password reset page. If you are not receiving your password reset link, please check your Spam folder. If you are still having trouble resetting your password, email us at info@documentcloud.org and we will send you the password reset link.

  • How do I change the email address associated with my account? To change your email address, visit our email update page.

  • I lost access to the email tied to my DocumentCloud account and no longer have the password to log in. What can I do? Email us at info@documentcloud.org and let us know the email associated with the DocumentCloud account. We can change the email tied to your DocumentCloud account for you, and then you can reset the password.

  • Can I delete my account? Accounts cannot be deleted, but they can be disabled by another user within your organization who has administrator-level privileges. We (and the public) value the documents you uploaded and made public and are glad to continue to host them.

Organizations

  • How does DocumentCloud organize accounts? Currently, we create accounts under the umbrella of an organization. That is, each user account is tied to at least one organization. This allows, for example, users within that organization to collaborate privately on documents.

  • How do I remove users from my organization? If you're an administrator for the organization, you can manage members by going to the main account management page and clicking on the organization name you want to change. Then, click "Manage Members".

  • How do I add users to my organization? If you're an administrator for the organization, the easiest way to add additional users is to send the users the following instructions:

    • Register for a free MuckRock account.
    • Click “Request to Join” from the organization page, which should look something like this: https://accounts.muckrock.com/organizations/daily-bugle/
    • Note: Some organizations run into a temporary limit of five users — to raise this limit, click “Upgrade,” leave the plan on free, and put in a larger number of users, and then click “Update.” You can still have an unlimited number of users with a free account, and we’re working on improving the flow of adding users in a future update.
  • How can I get accounts for others in my organization? Anyone in your organization who has a DocumentCloud account with administrator privileges can send invite links to add users to a DocumentCloud/MuckRock organization. Check around your organization. If you are not sure who has administrator privileges, or the admin has since left the organization, please email us at info@documentcloud.org for additional support.

  • How many user accounts can an organization have under its account? Currently, there is no limit.

  • What happens if my organization closes? Please notify DocumentCloud by email at info@documentcloud.org if your organization is closing, changing its name or experiencing another significant change. We value the documents uploaded and made public by our contributors and ask that you do not delete them. We will be glad to work with you on a transition plan. If the current holder of the documents is not a part of the new organization, you may have to do two transfers where the documents are transferred from User A from Organization A -> User B who is part of Organization A & B -> User C who is a member of Organization B.

  • What happens to my documents if my organization closes? Documents that are owned by an organization can be downloaded locally using the PDF Exporter Add-On that is enabled by default. Organizational administrators may opt to have the ownership of documents be transferred to users within the organization by selecting the documents, clicking "Edit" and then "Change Owner" or using the "Move Account" Add-On to transfer documents to another user or organization. Note: The Move Account Add-On requires that the account transferring the documents over to a new organization must be a member of that organization.

  • What happens to my DocumentCloud account if my organization closes? You retain your DocumentCloud account and can change the email address associated with the account to that of another organization or your personal email address.

  • What happens to my documents and account if I leave my current organization? You are able at any time to download the original documents you uploaded to our service. You may use the PDF Exporter Add-On, which is enabled by default, to download documents locally. Generally, we defer to each organization regarding disposition of the documents you uploaded to DocumentCloud while in their employment. Every organization has its own rules governing ownership of material generated while in their employment or service.

  • Our organization is changing its name. How do we change our name on MuckRock/DocumentCloud? Please email us at info@documentcloud.org and we would be more than happy to change your organization's name in our system.

  • Our organization is being purchased by a larger organization or merged with another team. How do we transfer the ownership of our documents over to the new team? You may transfer documents in batches of 25 at a time by selecting the documents, clicking on "Edit" -> "Change Owner" and selecting the new organization's name as well as a user in that organization to transfer the documents to. You may also use the "Move Account" Add-On to transfer large sets of documents over to the other organization. Note: The Move Account Add-On requires that the account transferring the documents over to a new organization must be a member of that organization. If the current holder of the documents is not a part of the new organization, you may have to do two transfers where the documents are transferred from User A from Organization A -> User B who is part of Organization A & B -> User C who is a member of Organization B.
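
    For a large set, it can help to plan the transfer as a list of groups of at most 25 documents before working through them in the workspace. The following is a minimal sketch of that arithmetic only; the document IDs are made up for illustration, and it does not call DocumentCloud or perform any transfer.

```python
from typing import Iterator, List, Sequence

BATCH_SIZE = 25  # the per-transfer limit mentioned in the answer above


def batches(document_ids: Sequence[int], size: int = BATCH_SIZE) -> Iterator[List[int]]:
    """Yield consecutive groups of at most `size` document IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(document_ids), size):
        yield list(document_ids[start:start + size])


# Hypothetical example: 60 selected documents need three passes.
example_ids = list(range(1, 61))
print([len(group) for group in batches(example_ids)])  # prints [25, 25, 10]
```

    For anything beyond a handful of passes, the "Move Account" Add-On described above is likely the more practical route.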

Verification

  • How do I verify my account? If you are part of an organization, you should first check that your organization does not already exist as an entity on MuckRock & DocumentCloud.

    If you do find your organization, you can click "Request to join" from the left sidebar of the organization or contact an administrator to add you directly. Members of organizations that have already been verified do not need to independently be verified.

    If your organization does not exist on MuckRock & DocumentCloud services already, you can create an organization.

    If you are an established freelancer, you can skip searching for your organization or creating an organization on MuckRock and DocumentCloud entirely.

    If you are a freelancer or part of a new organization that needs verification, you can find the "Request Verification to Upload" button to the right of the Upload button when you log into DocumentCloud. If you have joined a verified organization, you won't see this button as you do not need to be verified individually as a member of a verified organization.

DocumentCloud Premium

  • What is DocumentCloud Premium? DocumentCloud premium features are available to both paid professional and organizational accounts on MuckRock. DocumentCloud premium features include access to AI credits to perform advanced analysis on documents, access to Amazon's Textract OCR engine, and a growing feature list. For a full feature list, read more at the DocumentCloud Premium page. Paid professional and organizational accounts on MuckRock also gain access to monthly request credits on MuckRock, the ability to embargo requests, and bulk purchasing rates. Upgrading your account is done by visiting the plan selection page.
  • What can I find in the public catalog? We feature more than a million documents provided by contributors ranging from The New York Times to The Guardian and hundreds of large and small news organizations, freelance journalists and others who report using public documents.

  • Do I need an account to search the public catalog? No, you do not. We are proud to provide a valuable public resource at no cost.

  • Will getting a DocumentCloud account allow me to see more documents? No, whether you have an account or not, the only documents available for viewing are those explicitly shared by the users who upload them. The only exception is if you get added to a DocumentCloud/MuckRock organization that has documents set with permissions "private to your organization" - then you will get access to those documents as well.

  • How can I search documents? Using the search bar in the workspace, type the text you’d like to find in documents or search by attributes including Title, User, Project, Organization, Access, and more. Learn more in our documentation on searching.

  • Can I contribute documents to your public catalog? To start contributing, register for an account. You will need to go through account verification to begin uploading documents.

Uploading

  • What kinds of file types can I upload? The most common file type our users upload is PDF, but DocumentCloud can also convert over 70 file types into PDFs. This includes Word documents, Excel spreadsheets, PowerPoint presentations, HTML and image files. We cannot process video, audio or closed-format files such as Outlook PST files.

  • Is there a limit on the size of a file I can upload? Yes, 500 MB is the largest file you can upload. If your file is larger than 500 MB, you can try to use the PDF Compression DocumentCloud Add-On to compress the document before upload. If it is still too large, you will need to split the document up into smaller files.
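
    If you prepare uploads with a script, you can check file sizes before uploading. The sketch below is illustrative only: the function names are hypothetical, and it assumes "500 MB" means 500 × 1024 × 1024 bytes, which this FAQ does not specify.

```python
MAX_UPLOAD_BYTES = 500 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def upload_advice(size_bytes: int) -> str:
    """Say whether a file of this size can be uploaded as-is."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    if size_bytes <= MAX_UPLOAD_BYTES:
        return "upload"
    return "compress or split"


def minimum_parts(size_bytes: int) -> int:
    """Smallest number of parts that keeps every part within the limit."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    # Ceiling division without floating point; never fewer than one part.
    return max(1, -(-size_bytes // MAX_UPLOAD_BYTES))


# Hypothetical example: a 1,200 MB scan is too large and needs at least 3 parts.
size = 1200 * 1024 * 1024
print(upload_advice(size), minimum_parts(size))  # prints: compress or split 3
```

    In practice you would pass in the result of os.path.getsize(path) for each file you plan to upload.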

  • Are there restrictions on the content of documents I can upload? DocumentCloud is intended to be a repository of public documents. Our Terms of Service prohibit uploading copyrighted material that is not yours.

  • How long does it take to process a document? Processing times will vary depending on the size of the file and whether or not it needs to be OCR'd. Small documents are usually processed in a minute or less; larger documents or large sets of documents might take slightly longer.

  • What does DocumentCloud do with the documents I upload? When you upload a document, we save the original file. We extract images of each page in several sizes for our workspace and embeds. If there is a text layer embedded in the document, we retrieve that and store it in a database for searching. If there isn’t, we OCR the document to capture text.

  • When I upload a document, can anyone else see it? By default, documents are set to private access upon upload. That means only you can see it. You have the option of setting access to “private to your organization,” meaning anyone else with a DocumentCloud account in your organization can see it, or “public,” meaning it’s viewable by everyone.

  • Can I OCR my documents in languages other than English? Yes, we offer OCR in over 90 languages via the Tesseract OCR engine’s language packs. Select the document you want to OCR in the search viewer, click "Edit" -> "Force Reprocess" -> "Force OCR" and select the appropriate OCR language.

  • I don’t see the language I need for OCR. Can you add it? We are glad to add languages supported by Tesseract, the OCR engine we use. Please see the list of languages supported by Tesseract and contact us by email at info@documentcloud.org to discuss your needs.

  • What do I do when my documents are stuck in processing? Documents should almost never take more than a minute to process, but if they do there are a few steps you can take. First, try refreshing the page. Sometimes a document has processed, but it didn’t let your browser know for some reason. If you see this regularly, let us know, and include your browser and operating system. Second, try uploading a new copy of the document. This is a short-term fix, but sometimes just re-uploading will fix the issue. Finally, if the above doesn't work and a document has been processing for more than five minutes, please get in touch. If possible, include a sample of the file you were uploading, anything special that might have been related (such as uploading a large number of documents, changing the status of documents, very large documents), and the browser and operating system you were using.

  • What do I do if I have a lot of documents that have failed to upload? If you filter your uploads by typing status: in the search bar, select error or nofile, and notice a lot of documents that did not upload correctly, you can delete them by running the Clear Failed Uploads Add-On. This will allow you to clear all failed uploads without having to delete them 25 documents at a time. If you are uploading large sets of documents, we encourage you to use the DocumentCloud batch upload script to avoid receiving a lot of errors during processing. The batch upload script includes a flag, --reupload_errors, that you can use to re-attempt the documents that failed to upload the first time. The script also keeps track of your uploads in a database file; you can use DB Browser to view the tables in the database, filter by failed uploads or run queries against the database.

Working with Documents

  • What information can I add to my documents? You can add several pieces of information either before or after uploading. These include a source, description, published URL and related article URL. To access these fields after you’ve uploaded a document, select the document and choose “Edit Document Information.”

  • What is the difference between Related Article URL and Published URL? Use the Related Article URL to tell readers the location of the article that uses this document as source material. Adding a URL in this field creates a Related Article link in the sidebar of the full viewer. The Published URL is the page where the document is embedded. If a document might be accessed at more than one URL, however, you can specify the URL we should send users to if they find the document through a search of DocumentCloud.

  • How can I add custom data (tags) to organize and search my documents? DocumentCloud allows you to define and search your own set of custom data (key/value pairs) associated with specific documents. To edit data for individual documents in the workspace, select the documents you wish to update, and choose “Edit Document Data” from the “Edit” menu. See “Filter Fields” in our search documentation, specifically data_ and tag: search fields to learn more.
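
    As an illustration of how key/value pairs relate to search, the sketch below turns a set of custom data into search terms using the data_ prefix mentioned above. It assumes each pair is searched as data_KEY:VALUE and that values containing spaces can be wrapped in double quotes; confirm both against the search documentation. The function name and the sample keys are hypothetical.

```python
from typing import Mapping


def data_terms(data: Mapping[str, str]) -> str:
    """Build a search string with one data_KEY:VALUE term per pair."""
    terms = []
    for key, value in sorted(data.items()):
        # Assumption: quoting keeps a multi-word value together as one term.
        if any(ch.isspace() for ch in value):
            value = f'"{value}"'
        terms.append(f"data_{key}:{value}")
    return " ".join(terms)


# Hypothetical custom data attached to a document.
print(data_terms({"state": "Ohio", "agency": "EPA"}))
# prints: data_agency:EPA data_state:Ohio
```

    The resulting string can be pasted into the workspace search bar. Values that themselves contain double quotes are not handled by this sketch.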

  • How do I change the order of pages in a document I uploaded? Click on the document to open it in the document workspace. In the sidebar, click “Modify Pages.” You’ll see thumbnails of all the pages in your document. Select the pages you would like to move, then select "Move".

  • How do I insert or replace pages in a document I uploaded? To insert one or more pages:

    • Click on the document to open it in the document workspace.
    • In the sidebar, click “Modify Pages.” You’ll see thumbnails of all the pages in your document.
    • To insert new pages at a specific position within the document, click between the pages where you'd like the new pages to be inserted and select "Insert from other document".
    • When you’re ready, click the “Apply Modifications” button.
  • How do I remove pages from a document I uploaded? We recommend you retain a backup of your document before removing pages. To remove one or more pages:

    • Click on the document to open it in the document workspace. In the sidebar, click “Modify Pages.” You’ll see thumbnails of all the pages in your document.
    • Select the pages you’d like to remove from your document and click "Remove". When you are certain you'd like those pages to be removed, select "Apply Modifications".
    • Note that once you remove pages they are permanently deleted and your original document is replaced.
  • How do I redact portions of my document? We recommend you retain a backup of your document before making redactions. To redact a portion of a document:

    • Click on the document to open it in the document workspace. In the sidebar, click “Redact Document.”
    • Click and drag to draw a black rectangle over each portion of the document you’d like to redact. (You can redact more than one section at a time.)
    • When finished, click “Save Redactions.”
  • Does redacting a document also remove the text extracted from it? When you redact a portion of a document, we erase all data related to the redacted information, create a new redacted document, and delete the original document. Any text that was part of the redacted portion is deleted.

  • Can I change the orientation of a page in a document? Yes. Click on the document, click "Modify Pages", select the page(s) you'd like to rotate, select "Rotate" until you have achieved the desired orientation, and then click "Apply Modifications".

  • How do I delete documents? To delete an entire document, select the document by clicking the check mark next to the document from the search menu, click on the "Edit" button and then "Delete".

  • Once I delete a document, can I get it back? No, once you delete a document it’s permanently deleted from our platform.

Analyzing Data in Documents

  • How can I see entities extracted from my documents? Select a document in the workspace. Under the “Edit” menu, select “Entities” and then select "Extract entities".

Working With OCR and Document Text

  • What kind of OCR software does DocumentCloud use? We use Tesseract, an open-source OCR engine. Google currently sponsors development. DocumentCloud Premium users also have access to Amazon's Textract OCR which performs much better on scanned documents, handwritten text, and table extraction.

  • Do you OCR every document I upload? No. If your document contains embedded text and you have not selected the Force OCR option, we save the underlying text in our database. We use OCR when there’s no text layer. You may force a document to be OCR'd by selecting the document with the check mark in the search view, clicking "Edit" -> "Force Reprocess" -> "Force OCR" and selecting the appropriate language for the document. Note that if you Force OCR this way, it will run OCR on the entire document, not only the portions of the document that do not have a text layer.

  • If I run OCR on a document when I upload it to DocumentCloud, can I recover the original underlying text layer? Forcing OCR means the original text layer is lost. Force reprocessing the document will not recover the lost text layer. You will need to re-upload the original document. If you do not select Force OCR, the text layer remains intact.

  • Can I OCR a document even if it has text embedded in it? Yes. Double-click the document to open it in the document workspace. In the sidebar, click “Reprocess Text.” In the dialog, click “Force OCR.”

  • How do I download all the text from a document? Open the document and, in the bottom right-hand corner, switch the drop-down menu to "Plain Text". From there you can copy and paste the plain text of the document.

Annotating

  • What are notes? In DocumentCloud, notes are a way to highlight important sections of documents with a short headline and explanatory text. Notes can be private (viewable only by you), collaborator (viewable by anyone who has been added as a collaborator on the document), or public (viewable by anyone who has view access to the document).

  • How do I add a note? To add a note:

    • Click on the document to open it in the document workspace. In the sidebar, click "Add Note".
    • Drag your cursor to draw a box over the area of the document you want to highlight.
    • When you release your cursor, you’ll see a dialog box that lets you add a short headline and some explanatory text. It will also show you the access control restrictions (public, collaborator, private) when you create the note.
    • When done, click “Save.”
  • How do I edit an existing note? Find the note on your document and select it. Click the pencil icon to the right of the headline to edit the note.

  • What is the difference between public and private notes? Public notes are visible to anyone who has access to the document. Private notes are viewable only by the person who uploaded the document.

  • Can I make a private note public or vice-versa? Yes, click on the note you'd like to change the access level on, click the pencil icon, and change from private to public or collaborator as needed.

  • What is a page note, and how do I add one? Instead of highlighting a portion of a document, you can create a note that appears at the top of a page. To do this, follow the directions for creating a note and, rather than drawing a box on a page, click in between any two pages (or above the first page).

  • How can I format the text of notes to make words bold, italic, etc.? You can format text in notes by using some basic HTML tags. For example, to bold a phrase, precede it with an opening b tag (<b>) and end it with a closing b tag (</b>).

  • How do I publish a note? Any public notes you create are visible (i.e., published) as soon as you set the document’s access level to “public.”

  • How do I link directly to a note from a website? Each public note has a specific URL that you can share. To find it, select the note. Then click the chain-link icon to the right of the small headline. In your browser, the URL will change to the note link. Copy that and use it on your website. When a reader clicks the link, they’ll be directed to the document with the note open. If you are seeking to embed a note on your website, open the document that contains the note, click "Share" from the right-hand menu, select "Share specific note" and finally select the note you'd like to embed.

Projects

  • What are projects? Projects are labels you can apply to groups of documents to organize them by topic or project. A document can live in more than one project.

  • How do I create a project? In the workspace at left, click the “New Project” button. Give your project a name and click “Save.”

  • Can I create sub-projects inside a project? Not at this time. However, if you are looking for a way to easily organize, filter or search your documents, we recommend you add custom data.

  • How do I add documents to a project? You can drag and drop the file icon on the project title at the left of the workspace. Or highlight the file in the workspace, click the “Projects” icon, and choose the name of a project.

  • How do I remove documents from a project? Select a document in the workspace. Click the “Projects” menu, which displays all your project titles. You’ll see a check mark next to each project the document belongs to. Find the project that you want to remove the document from, and click the title to remove the check mark.

  • How do I share a project with others? In the workspace, hover over the name of your project in the project list, then click the pencil icon to show the project editing dialog. Click “Add a collaborator to this project.” You can add email addresses of people who have DocumentCloud accounts — whether they belong to your organization or another.

  • Can I make all documents in a project public at once? At this time, no. You can edit the access level of up to 25 documents at a time in the workspace by selecting the blank box next to the "Edit" drop-down, selecting "Edit", "Change Access" and selecting the appropriate access level. If you have a large number of documents, please contact us by email at info@documentcloud.org to discuss other options.

Collaboration

  • How can I see documents others in my organization have uploaded? In the search bar, clear any existing search queries, then type organization: followed by the name of your organization. You can combine the query with access: to see which of your organization's documents are set to public, private or organization access.

  • How do I share documents with a collaborator? If you’d like to share specific documents with others, first add the documents to a project. Then click the pencil icon next to the project name in the sidebar, and then “Manage Collaborators.” Collaborators must have an existing DocumentCloud account that’s linked to that email and they must have logged in to DocumentCloud at least once in order to be added to a project.

Embedding and Sharing documents

  • What options do I have for embedding documents? We currently offer four embed types:

    • Document: A viewer that shows the complete document, including attribution, all notes, and an available sidebar with navigation and attribution.
    • Page: A lightweight, responsive single page that includes attribution and click-through to the full document.
    • Note: A single annotation that includes attribution and click-through to the full document.
    • Projects: A collection of documents organized by project.
  • How do I embed a document, page, notes or collection of documents on my website? For documents, pages, and notes, click on the document you are seeking to embed to open it in the viewer, click "Share" in the right sidebar, and follow the share option you want. For projects, click on the pencil icon to the right of the project name, and click "Share/Embed Project".

  • How do I make a document visible to the public? In the workspace, select one or more documents. Under the “Edit” menu, choose “Change Access.” Pick one of the three options. “Public Access” means anyone on the internet can search for and view the document. “Private Access” means only you and people with explicit permission (via collaboration) have access. “Private to your organization” means only the people in your organization have access. (No freelancers.)

  • How do I set a time for a document to become public? If your document is private, you can set a publication date for it:

    • Select one or more documents in the workspace. Click "Edit" -> "Change Access" -> "Schedule publication".
    • Choose the date and time for publication and press “Change.” The date and time set will show in the workspace. If you change your mind, re-open the change access menu, de-select "Schedule publication" and click "Change".
  • How do I get the URL of a document I want to share? Make sure you have set the document’s access to “public.” Double-click a document to open it in the document workspace. The URL for sharing the document is now in your browser’s URL bar. The format is /documents/[ID Number]-[document-title-words].html
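
    To make the URL format concrete, here is a small sketch that assembles a link from an ID number and a title. The host is taken from the example link elsewhere in this FAQ; the slug rule (lowercase, with runs of other characters replaced by hyphens) is an assumption made for illustration. DocumentCloud assigns the real title words, so always copy the actual URL from your browser.

```python
import re

BASE_URL = "https://www.documentcloud.org"  # host used in this FAQ's example link


def slugify(title: str) -> str:
    """Lowercase the title and join its words with hyphens (assumed rule)."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def document_url(doc_id: int, title: str) -> str:
    """Assemble a link in the /documents/[ID Number]-[title-words].html format."""
    return f"{BASE_URL}/documents/{doc_id}-{slugify(title)}.html"


print(document_url(282753, "Lefler Thesis"))
# prints: https://www.documentcloud.org/documents/282753-lefler-thesis.html
```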

  • Will DocumentCloud work with my CMS? DocumentCloud is used by hundreds of organizations worldwide with many different content management systems. If your CMS offers the ability to embed snippets of JavaScript and HTML, you should be fine. We’re available to talk with your CMS’s developers to iron out any questions. Please contact us by email at info@documentcloud.org.

  • Do embeds work on phones? Yes, all of our embed types can be viewed on phone or tablet screens. However, our page and note embeds are responsive and are the best choice for display on those devices.

  • How do I embed documents using WordPress? The best way is to install our custom WordPress plugin, which lets you embed by entering shortcodes into your text. See the documentation for details. In most installations, you can do this right from your site’s plugin section. Usually, you can just drop a DocumentCloud link right in line and it should embed the document. Depending on your configuration, you might need to use a shortcode formatted like this: [documentcloud url="https://www.documentcloud.org/documents/282753-lefler-thesis.html"]

  • How do I change the width or height of an embed on my site? Select the "Customize Appearance" drop-down menu when in the embed page.

  • Can readers make changes to documents I embed? No, only you or DocumentCloud users in your organization can make changes to your documents.

  • Can I prevent people from downloading my original document? If you embed a document with our full viewer, you can disable the link to the original PDF that appears in the sidebar by selecting "Customize Appearance" and changing "PDF Link" to "Hidden" (it is visible by default). Nevertheless, once you set your document to public access, it will appear in Internet search results, and people will be able to download it.

  • How do I make the sidebar show or hide in the document viewer? Click "Customize Appearance" and change "Sidebar behavior" to "hidden". When embedding the viewer at narrow widths, hiding the sidebar is usually a good idea.

Keyboard Shortcuts

From the Document view:

  • A: Start annotating a document.
  • R: Start redacting a document.
  • S: Add or edit page sections.
  • Ctrl/CMD+F: Start searching through page.
  • Esc: Cancel the current action.

DocumentCloud API

  • What is the DocumentCloud API? DocumentCloud’s API provides resources to search, upload, edit, and organize documents as well as to work with projects. In addition, an oEmbed service provides easy integration of embedding documents, pages and notes. Full documentation is available.

  • Do you need a DocumentCloud account to use the API? As with DocumentCloud’s workspace, you need an account to use the API to upload, update or delete documents, or create and modify projects. Other API functions, such as search, do not require an account. Consult the documentation for details.
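
    Because search does not require an account, a request can be as simple as a URL with an encoded query. The sketch below only builds that URL and sends nothing. The endpoint address and the q parameter name are assumptions that should be checked against the API documentation, and the function name is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: verify this address against the API documentation before use.
API_SEARCH_URL = "https://api.www.documentcloud.org/api/documents/search/"


def search_url(query: str) -> str:
    """Return a search URL with the query text safely percent-encoded."""
    if not query.strip():
        raise ValueError("query must not be empty")
    return f"{API_SEARCH_URL}?{urlencode({'q': query})}"


# Hypothetical query combining an access filter with a search word.
print(search_url("access:public budget"))
# prints: https://api.www.documentcloud.org/api/documents/search/?q=access%3Apublic+budget
```

    Uploading, updating or deleting documents would additionally require authenticating with your account, as described in the documentation.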

  • What libraries are available for working with the API? We have provided an open-source Python wrapper for the DocumentCloud API, which is well documented.

  • Are there limits on API use? Yes, please see the rate limit documentation.