Connector for Box Version 2 Setup and Reference Guide


Published: 2018-Feb-23

Contents

1 Box Connector Introduction 5
  1.1 Products 5
  1.2 Supported Features 5
2 Box Connector Limitations 6
3 Box Connector Prerequisites 9
  3.1 Metadata Field Creation 9
  3.2 JDK version for OAuth 2.0 with JWT 9
4 Authentication Methods and Project Setup 10
  4.1 Access via OAuth 2.0 with JWT 10
    4.1.1 How to Use the Box Connector with OAuth 2.0 with JWT 10
    4.1.2 Set up the Box Application 12
    4.1.3 Authorize the Application for OAuth 2.0 with JWT 18
    4.1.4 Set up the Connector for OAuth 2.0 with JWT 21
  4.2 Token-based Authentication Methods and Setup 22
    4.2.1 Create Box Application 22
    4.2.2 Define Scopes 22
    4.2.3 Request Settings 23
    4.2.4 How to Use the Box Connector with Your Box Admin Credentials 23
      4.2.4.1 Provide Credentials for Token Generation 25
    4.2.5 How to Use the Box Connector with Manually Created Tokens 25
      4.2.5.1 Provide Tokens 27
        4.2.5.1.1 Proxy Setup and Network Access 28
        4.2.5.1.2 Recommind Box Token Generator 28
5 Box Specific Metadata Supported by the Connector 33
  5.1 Box Document Dates 41
  5.2 Author Information 42
    5.2.1 Box document authors 42
    5.2.2 Box note authors 42
    5.2.3 Box note author information in the CORE XML structure 43
  5.3 Box URL Hints 43
6 Box Custom and Template Metadata 44
7 Create a Data Source 47
  7.1 Start URI for Box Data Sources 47
8 Crawl a Specific User or Folder 48
9 Crawl Box Trash 50
10 Unknown Users Handling 51
11 Configure the Box Connector 52
  11.1 Start URIs 52
  11.2 Enable the Box connector 52
  11.3 API authentication mode 53
  11.4 Application Settings File for OAuth 2.0 with JWT 53
  11.5 Generate Tokens 54
  11.6 Client ID 54
  11.7 Client secret 54
  11.8 Box admin user name 55
  11.9 Box admin password 55
  11.10 Access token 55
  11.11 Refresh token 55
  11.12 Use proxy server 56
  11.13 Proxy host 56
  11.14 Proxy port 56
  11.15 Use proxy authentication 57
  11.16 Proxy user name 57
  11.17 Proxy user password 57
  11.18 Working directory 58
  11.19 Box Note Authors 58
  11.20 Index versions 58
  11.21 Index trash 59
  11.22 Box users to be crawled, specified by login name (email address) 59
  11.23 Box folders to crawl for this user 59
12 Changes to this Document 60
13 Contact Us 61
14 Terms of Use 62

Recommind, Inc. 2018.

1 Box Connector Introduction

This document describes how to configure and use the connector for the cloud storage service Box.com.

1.1 Products

This document applies to the following Recommind products and versions:
- Axcelerate 5.13 and up (for Axcelerate 5.13.1, a patch may be required)
- Axcelerate 5.8 up to Axcelerate 5.12 (without OAuth 2.0 with JWT access)

1.2 Supported Features

Feature | Support
Full Crawl | Y
Incremental Crawl (with modification date and checksum) | N
Folder Security (ACL) | N
Versions | Y (with limitations to metadata)
NAS Support | N
Exception Handling | Y

Tip: You can run a new full crawl of the Box data each time and use the system to deduplicate, so overlapping data is not published.

The connector supports crawling of Box files, versions, notes, comments, tasks and bookmarks.

2 Box Connector Limitations

Restricted Box trash extraction
Due to Box API limitations, retrieving content from the trash is restricted. Box files, notes or bookmarks can be retrieved from the trash. However, any comments or tasks that have been added to a Box object cannot be retrieved. You cannot specify trashed folders in the list of the folders to crawl.

Restricted metadata for Box versions
If the Box connector is configured to index versions, certain metadata values are extracted for current files, but will be missing for versions. This includes:
- box_description
  Impact: You cannot search, display, export, etc. the text in the description field.
- rm_foldername
- box_trashed_at
  Impact: You cannot search, filter, display, export, etc. the date value of when the previous version was trashed.
- rm_creationdate
  Impact: You cannot search, filter, display, export, etc. the date value.
- rm_lastmodifieddate
  Impact: You cannot search, filter, display, export, etc. the date value.
- box_created_by_id, box_created_by_login, box_created_by_name, etc.
  Impact: You cannot search, filter, display, export, etc. the created-by values. The modified-by values, however, are available for versions.
- box_owned_by_id, box_owned_by_login, box_owned_by_name, etc.
  Impact: You cannot search, filter, display, export, etc. the owned-by values. The modified-by values, however, are available for versions.
- box_shared_link_url
  Impact: You cannot search, display, export, etc. the text in the Shared Link field.
- box_tag
  Impact: You cannot search, filter, display, export, etc. the tags users have created on a Box file.
- Box template & custom metadata
  Impact: You cannot search, filter, display, export, etc. the Box template or custom metadata key-value pairs.

No Box comments and tasks for Box versions
Due to Box API limitations, comments and tasks are only extracted for the current file.

Creation and modification dates for Box versions
Due to Box API limitations, file system generated dates are only maintained for the current file, but not for versions. In addition, depending on how the file is uploaded (web page, mobile device, etc.), these dates may not be preserved. However, the date of the upload and of the last change on the Box server is available for all versions.

Box note formatting
Due to Box API limitations, any rich text formatting in a Box note is lost when indexed. The content is converted to plain text. When images are embedded into a Box note, these images are stored in a separate Box folder at the same level as the Box note itself. The folder is Box Notes Images\<Name of box note> Images. During data load, a note and its embedded image are indexed as separate documents.

More Box note versions shown than in Box
When working on Box notes, a version of the document is automatically saved by Box every 30 seconds. All of these versions are extracted by the connector. In contrast, the Box UI shows only one version every 5 minutes.

No Box comments modification dates
For Box comments, no modification date is given even though Box comments can be changed via the Box API. However, they cannot be changed with the Box UI.

Box Tasks on Box Notes
For Box notes, no Box tasks are extracted. It is possible to add such Box tasks with the Box API, but not with the Box UI.

No relationship indicated for comments and tasks on archives
Box comments and tasks may be attached to archive files (e.g. ZIP files). Depending on the data source configuration, archives are not indexed. However, the comments and tasks that would be children to these archives are indexed, but will not have a relationship to their parent because the parent itself is not indexed. If the data source is configured to index archive files, comments and tasks will be shown as attachments after data load.

Box custom metadata
The indexed property names for custom metadata keys depend on the display name that is used when first creating this key. Due to Box API limitations, later changes to the name from the Box administration page will not be reflected by the property name when indexed.

Duplicate ID Recomputation
Re-computing (rehashing) the duplicate IDs for already indexed Box documents is not supported. Please note that this limitation only affects the re-computation (rehashing) of duplicate IDs based on already indexed data via the siblings.bat script. Consistent duplicate ID computation during re-crawls is supported for Box documents.

Duplicate ID computation for trash
Due to Box API limitations, when identical objects are in the Box trash and some of these objects have comments or tasks while others do not, these will be considered duplicates by the connector.

Related:
- "Index versions" on page 58
- "Index trash" on page 59

3 Box Connector Prerequisites

3.1 Metadata Field Creation

Before Box data is crawled, you must create a Box template project that contains the Box metadata fields that will be captured. You will then use this template when you create new Axcelerate Ingestion and Axcelerate Review & Analysis applications. This ensures that the Box fields are transferred to the applications and that the primary data source is an actual Box data source.

3.2 JDK version for OAuth 2.0 with JWT

If you use Axcelerate 5.14 or 5.13, the default RSA key pair that is provided by Box needs to be converted into a key pair that matches the restricted security capabilities of JDK versions prior to JDK 8u162. Recommind Support can help with this process.

Related:
- "Client secret" on page 54
- "Working directory" on page 58
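As an illustration of the JDK 8u162 threshold named in section 3.2, the following Python sketch classifies a Java version string. It is a minimal sketch under assumptions: the function name is hypothetical, the version string is assumed to follow the usual `java -version` style (for example 1.8.0_151 or 11.0.2), and whether the JDK used by your Axcelerate installation can be inspected this way is not stated in this guide.

```python
import re


def needs_key_conversion(java_version):
    """Return True if the version string denotes a JDK older than 8u162.

    Hypothetical helper; it only classifies the version string and does
    not perform any key conversion.
    """
    # Java 8 and earlier report versions such as "1.8.0_151".
    legacy = re.fullmatch(r"1\.(\d+)\.\d+(?:_(\d+))?(?:-.*)?", java_version)
    if legacy:
        major = int(legacy.group(1))
        update = int(legacy.group(2) or 0)
        return major < 8 or (major == 8 and update < 162)
    # Java 9 and later report versions such as "9.0.4" or "11.0.2".
    modern = re.match(r"(\d+)", java_version)
    if not modern:
        raise ValueError(f"unrecognised Java version: {java_version!r}")
    return int(modern.group(1)) < 8
```

A result of True only indicates that the conversion described above is likely to be relevant; the conversion itself is still handled with Recommind Support.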

4 Authentication Methods and Project Setup

The connector connects to the Box application using OAuth 2.0 authentication. Before configuring the actual connector, decide which authentication method to use. The authentication method has an effect on the data source configuration.

- "How to Use the Box Connector with OAuth 2.0 with JWT" below
  This is the most convenient and recommended method.
- "How to Use the Box Connector with Your Box Admin Credentials" on page 23
  This method is deprecated as of Axcelerate 5.13. It uses automatic token generation based on Box admin credentials.
- "How to Use the Box Connector with Manually Created Tokens" on page 25
  This method is deprecated as of Axcelerate 5.13. It uses Recommind Token Generation. It can be used instead of the previous method that uses Box admin credentials if SSO is required.

4.1 Access via OAuth 2.0 with JWT

This API authentication method is recommended as it is the most convenient one. It needs a minimum of configuration settings in CORE Administration.

Related:
- "Box Connector Prerequisites" on the previous page

4.1.1 How to Use the Box Connector with OAuth 2.0 with JWT

The table below outlines the required application and data source setup to crawl data using the Box connector when directly connected to a Box application, without using a user account for authentication. This is the recommended setup.

The advantages of this method:
- You can use the same configuration settings for multiple Box data sources.
- You do not have to expose Box admin credentials in the data source configuration.
- You do not have to create tokens for each data source when using SSO.
- Since no tokens are used in the configuration, expiring tokens are not an issue.

Note: If you use Axcelerate 5.14 or 5.13, carefully read the Prerequisites.

Step 1
Task: Create a Box template project and add the Box-specific metadata fields that you will capture as part of your data crawls. If you are a LaunchPad user, you will use this template project each time you create new applications that require Box configuration within the same pod. If you are an On Premise user, you will use this template project each time you create new applications that require a Box configuration.
Additional information: "Box Specific Metadata Supported by the Connector" on page 33
Note: Contact Recommind Support if assistance is needed to ensure proper template setup.

Step 2
Task: Create a new project using your Box template project; this transfers the Box fields to the new Axcelerate Ingestion and Axcelerate Review & Analysis applications.
Additional information: LaunchPad users: Clone from a Template Application. On Premise users: Template Usage for CORE Configuration Settings.

Step 3
Task: Create a data source using the template data source in the project.
Additional information: "Create a Data Source" on page 47

Step 4
Task: Create a Box application and an application settings file that is referenced in the connector configuration.
Additional information: "Access via OAuth 2.0 with JWT" on the previous page

Step 5
Task: In the data source configuration, check all settings. In particular, in the Scope Settings node, input the Box accounts to be crawled and optionally enable Index trash.
Additional information: "Crawl a Specific User or Folder" on page 48, "Crawl Box Trash" on page 50

Step 6
Task: When the data source is correctly configured, start it.

4.1.2 Set up the Box Application

You need to set up a Box application to represent the permissions for the connector.

To create a new application:
1. Log in to Box as administrator and move to Dev Console.
2. In the Developer Console, click Create New App.
3. As type of application, select Custom App and click Next.

4. As authentication method, choose OAuth 2.0 with JWT (Server Authentication) and click Next.
5. Give the application a name of your choice and click Create App. If you get an error here, the name may already exist. In that case, use another name.

The application is created.
6. To open the application configuration page, click View Your App.
7. On the configuration page, note the entry Client Id. This ID is also known as Application ID or API ID and is needed for application authorization.

8. Scroll down to define the required permissions of the application. Make these settings:
   - Application Access: Set to Enterprise.
   - Application Scopes: Select Read all files and folders stored in Box, Read and write all files and folders stored in Box and Manage users.
   - Advanced Features: Select Perform Actions as Users.
9. Click Save Changes.
10. For indexing trash content, the additional permission Global Content Manager (GCM) is needed. This permission is not currently available in the UI, but has to be enabled by Box support. To request this, submit a Box API Case.

Note: If this step is completed at a later time, the step "Authorize the Application" has to be repeated to reflect the additional permission.
11. Scroll down and click Generate a Public/Private Keypair. This generates an RSA key pair that is used in the communication between the connector and the Box servers.

This starts a download in the browser. The downloaded file is the Box Application Settings File that must be referenced in the connector configuration. It contains the only copy of the private key, whereas the public key is stored on the server.

Important: Ensure that this file is stored securely, since it later allows access to the data of all users within the Box enterprise account.

Result: The configuration page should now indicate the ID of the public key that has just been created.
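Before referencing the downloaded application settings file in the connector configuration, it can help to confirm that the file is complete. The following Python sketch is only an illustration: the key names (boxAppSettings, clientID, clientSecret, appAuth, publicKeyID, privateKey, enterpriseID) are an assumption about the JSON layout of the file and are not taken from this guide, and the function name is hypothetical. Compare the names with your own file before relying on the check.

```python
import json

# Assumed layout of the downloaded settings file; these key names are not
# taken from this guide and should be checked against your own file.
REQUIRED_KEYS = (
    ("boxAppSettings", "clientID"),
    ("boxAppSettings", "clientSecret"),
    ("boxAppSettings", "appAuth", "publicKeyID"),
    ("boxAppSettings", "appAuth", "privateKey"),
    ("enterpriseID",),
)


def missing_settings(settings_text):
    """Return the dotted paths of required entries that are absent or empty."""
    data = json.loads(settings_text)
    missing = []
    for path in REQUIRED_KEYS:
        node = data
        for key in path:
            # Walk down the nested objects; stop descending on anything
            # that is not a JSON object.
            node = node.get(key) if isinstance(node, dict) else None
        if not node:
            missing.append(".".join(path))
    return missing
```

The function takes the file content as text, so the file can be read wherever it is stored securely. An empty result means that every assumed entry is present; it does not verify that the key pair is valid or that the application is authorized.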

4.1.3 Authorize the Application for OAuth 2.0 with JWT

The application now knows which permissions it needs. To actually grant these permissions for a Box enterprise account, the application has to be authorized.

1. Move to Admin Console.
2. In the Admin Console, from the settings menu, select Enterprise Settings (which may also be called Elite Settings or something similar).
3. Move to the Apps tab.
4. In the Custom Applications area, click Authorize New App. If the application already exists, the action "Reauthorize app" for the existing entry can be used instead.

5. Enter the API key and click Next. The API key is also known as Application ID or Client ID. To find out how to obtain the key, see "Set up the Box Application" on page 12.

6. Check that the displayed permissions are correct and click Authorize. The item Admin or co-admin can make calls for any content in their enterprise indicates that the Global Content Manager permission is enabled.

The application is now granted access to the relevant data within the Box enterprise account.

4.1.4 Set up the Connector for OAuth 2.0 with JWT

The application settings file now needs to be provided to the connector.

Tip: You can reuse the file for additional Box data sources.

1. In CORE Administration, in the data source configuration, move to Crawler Connectors > DMS Connectors > Box > Connection Settings.
2. In the API authentication mode field, select OAuth 2.0 with JWT.
3. In the Application settings file field, browse to the application settings file that you created (see "Set up the Box Application" on page 12).

Note: Make sure that the respective crawler host has access to this file.

4.2 Token-based Authentication Methods and Setup

Before you can configure the Recommind connector, complete the following tasks.

4.2.1 Create Box Application

Create a Box application that provides programmatic access to the Box Enterprise account. To create and configure such an application, log in as Box administrator and go to https://app.box.com/developers/services.

4.2.2 Define Scopes

The scopes required by the connector are:
- Read and write all files and folders stored in Box, which is enabled by default.
- Manage Users, which needs to be manually enabled.

4.2.3 Request Settings

Some additional settings cannot be made through configuration changes, but must be provided by Box support. Send your requests to api@box.com.

- File a request with Box to activate the As-User feature, which enables the Box application to access any Box Enterprise user account. The connector uses this access.
- If data is to be retrieved from trash: File a request with Box to activate the Global Content Manager (GCM) scope.

Note: This setting must be provided by Box before the Token Generation is executed. If a token pair has already been created, clear the working directory configured for the connector (and, if applicable, the tokens specified in the data source configuration) after Box provides the changes.

4.2.4 How to Use the Box Connector with Your Box Admin Credentials

Important: If you are using SSO, see this topic: "How to Use the Box Connector with Manually Created Tokens" on page 25

The table below outlines the required CORE application and data source setup to crawl data using the Box connector with your Box admin credentials.

Step 1
Task: Create a Box template project and add the Box-specific metadata fields that you will capture as part of your data crawls. If you are a LaunchPad user, you will use this template project each time you create new applications that require Box configuration within the same pod. If you are an On Premise user, you will use this template project each time you create new applications that require a Box configuration.
Additional information: "Box Specific Metadata Supported by the Connector" on page 33
Note: Contact Recommind Support if assistance is needed to ensure proper template setup.

Step 2
Task: Create a new project using your Box template project; this transfers the Box fields to the new Axcelerate Ingestion and Axcelerate Review & Analysis applications.
Additional information: LaunchPad users: Clone from a Template Application. On Premise users: Template Usage for CORE Configuration Settings.

Step 3
Task: Create a data source using the template data source in the project.
Additional information: "Create a Data Source" on page 47

Step 4
Task: Provide credentials for automatic token generation.
Additional information: "Provide Credentials for Token Generation" on the next page

Step 5
Task: In the data source configuration, check all settings. In particular, in the Scope Settings node, input the Box accounts to be crawled and optionally enable Index trash.
Additional information: "Crawl a Specific User or Folder" on page 48, "Crawl Box Trash" on page 50

Step 6
Task: When the data source is correctly configured, start it.

4.2.4.1 Provide Credentials for Token Generation

If SSO required is not enabled on the Box admin account, specify the login name and password of a Box Primary Admin in the data source configuration. Using these credentials, the connector automatically generates an initial token pair. Whenever needed, a new token pair is created with the most recent refresh token. Providing credentials is more convenient than providing tokens.

4.2.5 How to Use the Box Connector with Manually Created Tokens

The table below outlines the required CORE application and data source setup to crawl data using the Box connector with manually created tokens.

Step 1
Task: Create a Box template project and add the Box-specific metadata fields that you will capture as part of your data crawls. If you are a LaunchPad user, you will use this template project each time you create new applications that require Box configuration within the same pod. If you are an On Premise user, you will use this template project each time you create new applications that require a Box configuration.
Additional information: "Box Specific Metadata Supported by the Connector" on page 33
Note: Contact Recommind Support if assistance is needed to ensure proper template setup.

Step 2
Task: Create a new project using your Box template project; this transfers the Box fields to the new Axcelerate Ingestion and Axcelerate Review & Analysis applications.
Additional information: LaunchPad users: Clone from a Template Application. On Premise users: Template Usage for CORE Configuration Settings.

Step 3
Task: Create a data source using the template data source in the project.
Additional information: "Create a Data Source" on page 47

Step 4
Task: Run the token generation process and copy the token credentials into the data source configuration.
Note: You must run the token generation process each time you create a new project and configure the initial data source for use with the Box connector, and also for each additional data source.
Additional information: "Authentication Methods and Project Setup" on page 10, "Provide Tokens" on the next page

Step 5
Task: In the data source configuration, check all settings. In particular, in the Scope Settings node, input the Box accounts to be crawled and optionally enable Index trash.
Important: Each time you crawl data, use the same configured Box data source. You do not need to repeat the initial token generation process for subsequent crawls as long as the token does not expire.
Additional information: "Crawl a Specific User or Folder" on page 48, "Crawl Box Trash" on page 50
Note: If you create a new data source and use an existing data source with an active token as a template, the system will generate a Refresh token has expired error in the CORE Log Viewer. This does not mean the token expired for the original data source, just that it cannot be reused.

Step 6
Task: When the data source is correctly configured, start it.
Note: As long as the data source is used to crawl data at least once every 60 days, the token will not become inactive. If the token does become inactive, repeat the token generation process and copy the new information into the data source configuration.

4.2.5.1 Provide Tokens

If SSO required is enabled on the Box admin account, or if the admin credentials must not be directly exposed due to security policy, you must provide a token pair consisting of an access token and a refresh token.

Caution: Only use an account with the required Box rights to create the tokens. The tokens inherit the rights of the account they have been created with. In case of insufficient rights, you will see a 403 error in the crawler log files when the generated tokens are being used by the connector.
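The 60-day inactivity window stated for the refresh token can be turned into a simple reminder check. The following Python sketch is illustrative only: the function names are hypothetical, the date of the last crawl has to come from your own records, and treating day 60 itself as already expired is a conservative assumption, because this guide does not define the exact boundary.

```python
from datetime import date, timedelta

# Inactivity window for the refresh token as stated in this guide.
INACTIVITY_LIMIT = timedelta(days=60)


def crawl_deadline(last_crawl):
    """Date by which the data source should crawl again to keep the token active."""
    return last_crawl + INACTIVITY_LIMIT


def refresh_token_expired(last_crawl, today):
    """Conservative check: treat the token as expired from the deadline onwards."""
    return today >= crawl_deadline(last_crawl)
```

For example, a data source that last crawled on 1 January 2018 would need to crawl again before 2 March 2018 under this assumption; after that, a new token pair has to be generated.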

Note: If SSO is required, tokens cannot be created automatically using admin credentials. This is because the login process differs for each Identity Provider (IdP).

To provide tokens, create a token pair with the Recommind Box Token Generator and copy it to the data source configuration. The provided token pair is only needed for the initial data source setup. For the following runs (with the same connector working directory), the tokens will automatically be replaced with new pairs that are created with the latest refresh token. The latest refresh token expires only after 60 days of inactivity. You then have to provide a new valid token pair generated with the Recommind Box Token Generator.

Note: A token pair can only be used by a single connector. Each data source needs its own dedicated token pair.

Related:
- "Recommind Box Token Generator" below
- "Generate Tokens" on page 54
- "Access token" on page 55
- "Refresh token" on page 55

4.2.5.1.1 Proxy Setup and Network Access

For the proper functioning of the Recommind Box Token Generator, network access is required. This has to be taken into account when working in a network-restricted environment.
- The token generator itself needs access to *.box.com.
- If a proxy is configured for the Recommind Box Token Generator, the proxy needs access to *.box.com.
- The web browser needs access to the token generator and to *.box.com. If SSO is used, it also needs access to the server of the Identity Provider (IdP). These are the same requirements that apply to logging in to Box.

4.2.5.1.2 Recommind Box Token Generator

The Recommind Box Token Generator is a small standalone tool that facilitates the initial generation of a token pair. It starts a web server that serves a page guiding the user through token pair generation.

Configure Box Redirect URL for Token Generation

For the Recommind Box Token Generator to work properly, configure the correct redirect URL in the Box application. The redirect URL must point to the Token Generator's main page. By default, this is http://localhost:5858/boxtokengenerator. When running the generator on a different machine, change the redirect URL to point to that machine instead of localhost. Also make sure that the port matches the port in the Recommind Box Token Generator's configuration.

To change the relevant Box configuration parameter:
1. Log in as Box admin.
2. Go to https://app.box.com/developers/services and click Edit Application for the respective application.
3. Under OAuth2 Parameters, adapt the redirect_uri parameter.

Install and Run the Recommind Box Token Generator

Required:
On Premise clients: Download connectors\recommind-box-token-generation.zip from Recommind's FTP server, or via the link provided by Recommind Support. Save the file to an accessible location on your Admin server.
Cloud clients: The Recommind Box Token Generator zip file was installed on your Admin server during pod setup. If you cannot locate the file, contact Recommind Support.
A Java Runtime Environment (JRE) with Java 8 is installed on the machine on which the tool will run. The path to java.exe is set in the PATH system variable.

To install and run the Recommind Box Token Generator:
1. Extract the zip file to an accessible folder on the Admin server.
2. Open the application.properties file; edit and save the configuration. Enter the clientid and the clientsecret for your Box environment. If you do not know where to find this information, ask your Box Administrator. If you use a proxy server to access your Box servers, also populate proxyhost and proxyport.

Figure: application.properties file

Configuration Property / Details

server.port: Mandatory setting. Default: 5858. Sets the port under which you can reach the tool with a browser. Use the same port for the redirect URL configured for the Box application. See: "Configure Box Redirect URL for Token Generation" on the previous page
clientid: This mandatory value indicates the Box application Client ID. It must match the Client ID in the data source configuration.
clientsecret: This mandatory value indicates the Box application secret. It must match the Client secret in the data source configuration.
proxyhost: This optional value indicates the host of the proxy to use when accessing the Box servers internally. When no value is given here, no proxy is used for this communication.
proxyport: Port of the proxy to use. It is mandatory when a proxy is used.
proxyuser: User name to use when authenticating at the proxy. It is only needed when a proxy is configured and the proxy requires authentication. If left empty, no authentication is used for the proxy.
proxypassword: Password to use when authenticating at the proxy. This value is mandatory when a proxy with authentication is used.

3. To apply the changed properties, start the Recommind Box Token Generator by clicking box-token-generator.bat. To restart it, close the command prompt for the BAT file with [CTRL]+[C] and then click box-token-generator.bat again. Optionally, check the log output in recommind-box-token-generator.log.

Note: Running the .bat file briefly opens a command line window, which closes on its own.

Figure: .bat command line

4. Open the URL http://localhost:5858/boxtokengenerator with a web browser.
Important: If you specified another server or port in the application.properties file, use the URL with those settings. See: "Configure Box Redirect URL for Token Generation" on page 29

If the setup was successful, the Recommind Box Token Generator's main page with the configured settings is shown.

Generate Tokens and Copy the Token Credentials to the Data Source Configuration

Required: You are not logged in to Box if you are using SSO, or you are logged in with the admin account for which tokens will be generated.

1. After you have started the Recommind Box Token Generator, the Token Generator's main page with the configured settings is shown. Verify that the correct configuration values and redirect URL are shown.

Figure: Recommind Box Token Generator configuration settings

2. Click the Generate Tokens link. If you are not logged in to Box, you are redirected to a Box login page.
3. Log in with your Box admin account, either directly or by using SSO. You are redirected to a Box page asking you to grant permissions to the Box App identified by the Client ID in the Box Token Generator configuration.
4. Click Grant access to Box. You are redirected to the Box Token Generator web page. An additional section labeled Resulting Tokens shows the token pair that has been generated.
Tip: Leave this screen open, as you will copy information from it into the data source configuration settings.
5. Open the workspace in CORE Administration containing the application which holds the data source you configured for use with the Box connector. Open the data source configuration. In the Crawler Connectors > DMS Connectors > Box > Connection Settings area, copy the Access token and Refresh token credentials into the corresponding fields of the connector configuration.

Tip: To generate another pair, click Generate Tokens again. To return to the initial page, click Reset Page.

Note: Do not use the browser's back button and do not reload the page. This may lead to errors because the provided URL parameters are valid only once. Instead, use the Reset Page link (if available) or enter the initial URL again.
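The rules for application.properties given in this chapter (which settings are mandatory, and how the port relates to the URL of the Token Generator's main page) can be checked before starting the tool. The following is a minimal sketch under stated assumptions: it assumes a plain key=value file layout, uses the property names as printed in this guide, and all function names and sample values are hypothetical. It is not part of the Recommind Box Token Generator.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines; lines starting with '#' or '!' are comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_generator_config(props: dict) -> list:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for key in ("server.port", "clientid", "clientsecret"):
        if not props.get(key):
            problems.append(f"{key} is mandatory")
    if props.get("proxyhost") and not props.get("proxyport"):
        problems.append("proxyport is mandatory when a proxy is used")
    if props.get("proxyuser") and not props.get("proxypassword"):
        problems.append("proxypassword is mandatory when the proxy requires authentication")
    return problems


def generator_url(props: dict, host: str = "localhost") -> str:
    """Build the main page URL, which must also be the Box redirect URL."""
    return f"http://{host}:{props.get('server.port', '5858')}/boxtokengenerator"


# Hypothetical sample values, for illustration only.
SAMPLE = """
# application.properties
server.port=5858
clientid=example-client-id
clientsecret=example-client-secret
proxyhost=proxy.example.com
"""
```

For the sample above, check_generator_config reports the missing proxyport, and generator_url returns the default URL named in "Configure Box Redirect URL for Token Generation".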

5 Box Specific Metadata Supported by the Connector

Supported metadata are extracted from Box items and added to the indexed document as XML tags. To make metadata visible to users, or to make them searchable, create a new field in the document model of the respective Recommind application.

Property name / Description of this property in Box

box_bookmark_url: URL of a bookmark.
box_can_non_owners_invite: Indicates whether non-owners can invite collaborators to the folder.
box_comment_count: For documents that can have comments attached.
box_comment_is_reply: Only meaningful for comments.
box_created_at: Contains the creation date supplied by Box.
box_created_by_id: The user that created/uploaded the document.
box_created_by_login: See explanation for box_created_by_id.
box_created_by_name: See explanation for box_created_by_id.
box_created_by_role: See explanation for box_created_by_id.
box_created_by_language: See explanation for box_created_by_id.
box_created_by_timezone: See explanation for box_created_by_id.
box_created_by_space_amount: See explanation for box_created_by_id.
box_created_by_space_used: See explanation for box_created_by_id.
box_created_by_max_upload_size: See explanation for box_created_by_id.

Property name / Description of this property in Box

box_created_by_status: See explanation for box_created_by_id.
box_created_by_job_title: See explanation for box_created_by_id.
box_created_by_phone: See explanation for box_created_by_id.
box_created_by_address: See explanation for box_created_by_id.
box_created_by_avatar_url: See explanation for box_created_by_id.
box_created_by_enterprise: See explanation for box_created_by_id.
box_created_by_is_sync_enabled: See explanation for box_created_by_id.
box_created_by_is_external_collab_restricted: See explanation for box_created_by_id.
box_created_by_can_see_managed_users: See explanation for box_created_by_id.
box_created_by_is_exempt_from_device_limits: See explanation for box_created_by_id.
box_created_by_is_exempt_from_login_verification: See explanation for box_created_by_id.
box_created_by_is_password_reset_required: See explanation for box_created_by_id.
box_created_by_is_platform_access_only: See explanation for box_created_by_id.

Property name / Description of this property in Box

box_created_by_my_tags: See explanation for box_created_by_id.
box_created_by_email_alias: See explanation for box_created_by_id.
box_created_by_hostname: See explanation for box_created_by_id.
box_description: The description as given by Box.
box_document_type: Possible values: comment, file, file_version, folder, task.
box_file_id: File ID. Common ID between all file versions.
box_has_collaborations: Indicates whether there are collaborations for the item.
box_has_versions: Set for all files and versions. No means that no older versions exist.
box_metadata_<template>_<key>, box_metadata_properties_<key>: Metadata key/value pair. Custom key/value pairs have properties in their name.
box_modified_at: Contains the modification date supplied by Box.
box_modified_by_id: The user that last modified the document.
box_modified_by_login: See explanation for box_modified_by_id.
box_modified_by_name: See explanation for box_modified_by_id.
box_modified_by_role: See explanation for box_modified_by_id.
box_modified_by_language: See explanation for box_modified_by_id.
box_modified_by_timezone: See explanation for box_modified_by_id.

Property name / Description of this property in Box

box_modified_by_space_amount: See explanation for box_modified_by_id.
box_modified_by_space_used: See explanation for box_modified_by_id.
box_modified_by_max_upload_size: See explanation for box_modified_by_id.
box_modified_by_status: See explanation for box_modified_by_id.
box_modified_by_job_title: See explanation for box_modified_by_id.
box_modified_by_phone: See explanation for box_modified_by_id.
box_modified_by_address: See explanation for box_modified_by_id.
box_modified_by_avatar_url: See explanation for box_modified_by_id.
box_modified_by_enterprise: See explanation for box_modified_by_id.
box_modified_by_is_sync_enabled: See explanation for box_modified_by_id.
box_modified_by_is_external_collab_restricted: See explanation for box_modified_by_id.
box_modified_by_can_see_managed_users: See explanation for box_modified_by_id.
box_modified_by_is_exempt_from_device_limits: See explanation for box_modified_by_id.

Property name / Description of this property in Box

box_modified_by_is_exempt_from_login_verification: See explanation for box_modified_by_id.
box_modified_by_is_password_reset_required: See explanation for box_modified_by_id.
box_modified_by_is_platform_access_only: See explanation for box_modified_by_id.
box_modified_by_hostname: See explanation for box_modified_by_id.
box_modified_by_my_tags: See explanation for box_modified_by_id.
box_modified_by_email_alias: See explanation for box_modified_by_id.
box_name: Name, e.g. name of a bookmark.
box_note_author: Indicates an author for a Box note. There may be multiple authors, and each one is described with the properties box_note_author_id, box_note_author_login and box_note_author_name. Example:
    <box_note_author type="main">
    <box_note_author_id>65656565</box_note_author_id>
    <box_note_author_login>smith@recommind.com</box_note_author_login>
    <box_note_author_name>asmith</box_note_author_name>
    </box_note_author>
box_note_author_id: The ID of this Box note author.
box_note_author_login: The login name of this Box note author.
box_note_author_name: The name of this Box note author.

Property name / Description of this property in Box

box_owned_by_id: The user that has the role OWNER of a document. Can be different from the creator.
box_owned_by_login: See explanation for box_owned_by_id.
box_owned_by_name: See explanation for box_owned_by_id.
box_owned_by_role: See explanation for box_owned_by_id.
box_owned_by_timezone: See explanation for box_owned_by_id.
box_owned_by_space_amount: See explanation for box_owned_by_id.
box_owned_by_space_used: See explanation for box_owned_by_id.
box_owned_by_max_upload_size: See explanation for box_owned_by_id.
box_owned_by_status: See explanation for box_owned_by_id.
box_owned_by_job_title: See explanation for box_owned_by_id.
box_owned_by_phone: See explanation for box_owned_by_id.
box_owned_by_address: See explanation for box_owned_by_id.
box_owned_by_avatar_url: See explanation for box_owned_by_id.
box_owned_by_language: See explanation for box_owned_by_id.
box_owned_by_enterprise: See explanation for box_owned_by_id.
box_owned_by_is_sync_enabled: See explanation for box_owned_by_id.

Property name / Description of this property in Box

box_owned_by_is_external_collab_restricted: See explanation for box_owned_by_id.
box_owned_by_can_see_managed_users: See explanation for box_owned_by_id.
box_owned_by_is_exempt_from_device_limits: See explanation for box_owned_by_id.
box_owned_by_is_exempt_from_login_verification: See explanation for box_owned_by_id.
box_owned_by_is_password_reset_required: See explanation for box_owned_by_id.
box_owned_by_is_platform_access_only: See explanation for box_owned_by_id.
box_owned_by_hostname: See explanation for box_owned_by_id.
box_owned_by_my_tags: See explanation for box_owned_by_id.
box_owned_by_email_alias: See explanation for box_owned_by_id.
box_shared_link_url: URL of a shared link.
box_status: Can be either 'active', 'trashed' or 'deleted'. 'trashed' indicates that an item can be restored. 'deleted' items cannot be (directly) restored, since they are, e.g., in a deleted folder.
box_tag: Contains custom tags assigned to a document in Box. Each value reflects one tag.
box_task_action: The action that the assignee of the task shall perform. At the moment this can only be REVIEW.
box_task_due_at: The due date of the task, if a due date has been set.

Property name / Description of this property in Box

box_task_assigned_at: For each task assignment, the date on which this task was assigned. Currently not filled by Box.
box_task_assigned_by_id: For each task assignment, the ID of the user that assigned this task. Currently not filled by Box.
box_task_assigned_by_login: For each task assignment, the login name of the user that assigned this task. Currently not filled by Box.
box_task_assigned_by_name: For each task assignment, the name of the user that assigned this task. Currently not filled by Box.
box_task_assigned_to_id: For each task assignment, the ID of the user the task is assigned to.
box_task_assigned_to_login: For each task assignment, the login name of the user the task is assigned to.
box_task_assigned_to_name: For each task assignment, the name of the user the task is assigned to.
box_task_completed_at: For each task assignment, the date by which the assignment is to be completed. Currently not filled by Box.
box_task_is_completed: Indicates whether this task is completed (true) or not (false).
box_task_message: For each task assignment, the message that is included with the assignment of the task. Currently not filled by Box.
box_task_reminded_at: For each task assignment, the date on which the assignee shall be reminded about this task. Currently not filled by Box.
box_task_resolution_state: For each task assignment, the resolution state of the task assignment. Values can be COMPLETED, INCOMPLETE, APPROVED or REJECTED. Currently not filled by Box.
box_trashed_at: Date on which the document was trashed.
box_upload_email: Email address to which uploads for a folder can be sent.

Property name / Description of this property in Box

box_upload_email_access: Access level for an upload email address. Can be COLLABORATORS or OPEN.
box_url_hint: URL to track back an item in the Box front-end if logged on with the user found in the URI.
box_version_number: Version number of a file. When making changes, this number is incremented automatically.

5.1 Box Document Dates

For the individual documents, Box provides different dates. These date properties may occur in XML metadata after data load:

box_created_at: Indicates the time when the document (or version) was uploaded to the Box server.
box_modified_at: Indicates the time that the document was last changed on the Box server. This can be an upload or a direct modification on the server (e.g. in case of Box notes).
rm_creationdate: Indicates the creation of the content. It is derived from the file properties during upload and usually reflects the time that the file was originally created. Note: This type of date is not available for Box versions.
rm_lastmodifieddate: Indicates the last modification date of the content. It is derived from the file properties during upload and usually reflects the time that the file was last changed before upload. This date is overwritten when the document is modified directly on the Box server (e.g. for Box notes). Note: This type of date is not available for Box versions.

Note: The dates derived from the file properties may not be available, depending on the method of upload.

More details on how Box handles dates can be found here:
https://box-content.readme.io/docs/content-times

https://community.box.com/t5/managing-your-content/understanding-box-File-Timestamps/ta-p/339

5.2 Author Information

Author information is provided by Box. Authors are the users that created a document in Box or uploaded a document to Box. For Box notes, a special mapping is used.

5.2.1 Box document authors

The author for uploaded documents is the user that performed the upload. For instance, if John Smith wrote a Microsoft Word document, and Ann Miller uploaded it to Box, Ann Miller is shown in the Sender/Author Smart Filter. You find Box document creator information in the box_created_by_name property, which is mapped to the CORE rm_author field. Versions inherit the author from the current document.

Note: If Box does not provide a creator for a document, the rm_author field remains empty. This may happen when a user is deleted and their data is transferred to another account. In this case, the transferred documents do not have an author, as the creator has been deleted.

5.2.2 Box note authors

Box note author information is also mapped to the CORE rm_author field. Box notes may have several authors, but only one is shown in the rm_author field.

Box note author information has up to three properties: box_note_author_name, box_note_author_id, box_note_author_login. If the default configuration is not changed, these are concatenated to form one value in the CORE rm_author field.

5.2.3 Box note author information in the CORE XML structure

This author information for a Box note:

<box_note_author type="main">
<box_note_author_id>6437318</box_note_author_id>
<box_note_author_login>miller@recommind.com</box_note_author_login>
<box_note_author_name>mandy_miller</box_note_author_name>
</box_note_author>

is concatenated and mapped to the rm_author field like this:

<rm_author type="main">mandy_miller (miller@recommind.com, 6437318)</rm_author>

Related:
"Box Note Authors" on page 58
"Unknown Users Handling" on page 51

5.3 Box URL Hints

The CORE XML tag box_url_hint is filled with a Box URL hint. You can use this URL to directly find and access the corresponding document from a web browser on Box.com, for analyzing or troubleshooting.

The URL does not contain user information. To make the Box URL hint work correctly, you must log in to Box with the correct account. The account belonging to the URL can be derived from the Location Smart Filter.
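The default concatenation shown in section 5.2.3 can be reproduced with a few lines of Python. This is an illustrative sketch only, not the connector's implementation: the function name is hypothetical, and it assumes that all three author properties are present.

```python
import xml.etree.ElementTree as ET


def format_rm_author(note_author_xml: str) -> str:
    """Build the rm_author value as "name (login, id)" from a box_note_author element."""
    author = ET.fromstring(note_author_xml)
    name = author.findtext("box_note_author_name", default="")
    login = author.findtext("box_note_author_login", default="")
    author_id = author.findtext("box_note_author_id", default="")
    return f"{name} ({login}, {author_id})"


# The example element from section 5.2.3.
EXAMPLE = """<box_note_author type="main">
  <box_note_author_id>6437318</box_note_author_id>
  <box_note_author_login>miller@recommind.com</box_note_author_login>
  <box_note_author_name>mandy_miller</box_note_author_name>
</box_note_author>"""

print(format_rm_author(EXAMPLE))
```

How the connector behaves when one of the three properties is missing is not covered by this sketch.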

6 Box Custom and Template Metadata

In addition to default Box metadata, the connector supports metadata added by users. Box allows a user to add key-value pairs to certain Box documents as custom metadata. An administrator can create metadata templates that each define a set of keys. A user can assign such a template to a document and then assign values to the individual keys belonging to the template.

For each key, the connector creates a corresponding output property in the XML file, with the corresponding value or no value if none has been set. The name of the property follows one of these patterns:

box_metadata_<templatename> <key>
<templatename> is the name of the template. <key> is the name of the key. For template metadata key names, Box carries out some operations on the string: e.g., characters are changed to lowercase, non-alphanumeric characters are dropped, and so on.
Note: The name for a template metadata key depends on the display name that is used when first creating this key in the template. Later changes to the name are not reflected upon indexing.

box_metadata_properties <key>
properties indicates that this is a custom property. <key> is the name of the key.

Example: Metadata as set in Box and in the XML file after data load

Assume these user-added metadata:

Figure: Metadata set in Box

In the XML file, these metadata look like this:

<box_metadata_exampletemplate exampletext type="main">example Text Value</box_metadata_exampletemplate exampletext>
<box_metadata_exampletemplate examplenumber type="main">21.0</box_metadata_exampletemplate examplenumber>
<box_metadata_exampletemplate exampledate type="main">2016-01-01t00:00:00.000z</box_metadata_exampletemplate exampledate>
<box_metadata_exampletemplate exampledropdown type="main">option One</box_metadata_exampletemplate exampledropdown>
<box_metadata_properties Example_0x20_custom_0x20_metadata_0x20_field type="main">example custom metadata value</box_metadata_properties Example_0x20_custom_0x20_metadata_0x20_field>
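In the custom metadata example above, each space in the key name appears as _0x20_ in the XML property name. The following sketch mimics that escaping so that you can predict the key part of a property name when defining document model fields. It is an assumption-based illustration: only the space character is confirmed by the example, the treatment of other characters is a guess, and the function names are hypothetical.

```python
import re


def encode_custom_key(key: str) -> str:
    """Escape characters as _0xNN_ the way the space is escaped in the example.

    Only the space (0x20) is confirmed by the guide; escaping every other
    character outside letters, digits, '_', '-' and '.' is an assumption.
    """
    out = []
    for ch in key:
        if ch.isalnum() or ch in "_-.":
            out.append(ch)
        else:
            out.append(f"_0x{ord(ch):02X}_")
    return "".join(out)


def decode_custom_key(encoded: str) -> str:
    """Reverse the escaping for two-digit hexadecimal codes."""
    return re.sub(r"_0x([0-9A-Fa-f]{2})_",
                  lambda m: chr(int(m.group(1), 16)), encoded)


print(encode_custom_key("Example custom metadata field"))
```

The sketch covers only the key part of the name; the prefix and the separator in front of the key are as described in the patterns in this chapter.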

7 Create a Data Source

To create a data source with the Box connector enabled:
1. In the Axcelerate Ingestion module, on the Data Sources tab, click Add new data source.
2. In the Define template type for data source step, select System template and click Next.
3. In the Define the data type of the data source step, select Document Management and click Next.
4. In the Template selection step, select Box.
5. Follow the wizard.
Tip: The settings that follow can be changed after data source creation.
6. In the last step, do not choose Start immediately; choose Finish instead. You first need to enter at least the information that allows the connector to connect to Box.

7.1 Start URI for Box Data Sources

Start URIs for the Box connector use the scheme box. A valid URI is, for example: box:anything. Multiple Box start URIs are not supported.

Related: "Start URIs" on page 52
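The two rules from section 7.1 (scheme box, a single start URI) can be checked in a small script before entering the value in the data source configuration. This is a minimal sketch; the function name is hypothetical and the check is not performed this way by the connector itself.

```python
from urllib.parse import urlsplit


def check_start_uris(start_uris: list) -> str:
    """Return the single Box start URI, or raise ValueError if a rule is violated."""
    if len(start_uris) != 1:
        raise ValueError("exactly one Box start URI is expected")
    uri = start_uris[0]
    if urlsplit(uri).scheme != "box":
        raise ValueError(f"start URI must use the scheme 'box': {uri!r}")
    return uri


print(check_start_uris(["box:anything"]))
```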

8 Crawl a Specific User or Folder

If only a part of the Box enterprise account is to be crawled, you can configure the connector to crawl only certain folders in certain accounts.

Under Box Scope Settings, you can add the login names (email addresses) of the Box accounts to include in the crawl. If the list is empty, all accounts are crawled. As soon as there is an entry, only the listed accounts are crawled.

Important: If you are using the same data source to crawl data for new custodians, consider manually changing the data source name in the crawler settings, in the URI-based annotation/uri patterns section.

Figure: List of user accounts (email addresses) to include in the crawl

For each account that you add to the list, a new configuration node named after the account is added to the tree. Open it to add the folders to crawl for that account to the List of Box folders to be included.

When specifying folders, do not include the All Files folder shown in the Box UI. Instead, entries have to start with a slash (/). To specify a folder deeper in the folder hierarchy, use / as the delimiter between the folders in the path. A trailing / is ignored.

Figure: List of folders to be included

If the list is empty, all folders of the account are crawled.

Note: If the Index folders check box is selected, the root folder is always indexed.

Related:
"Box users to be crawled, specified by login name (email address)" on page 59
"Box folders to crawl for this user" on page 59
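The folder entry rules in this chapter (leading slash, no All Files folder, / as delimiter, trailing / ignored) and the "empty list means everything" rule for accounts can be sketched as follows. The function names are hypothetical, and the exact matching the connector applies (for example, case sensitivity of login names) is an assumption of this sketch.

```python
def normalize_folder_entry(entry: str) -> str:
    """Validate a folder entry and drop a trailing slash.

    As a side effect, repeated slashes are collapsed; whether the connector
    does the same is not stated in the guide.
    """
    path = entry.strip()
    if not path.startswith("/"):
        raise ValueError(f"folder entries have to start with a slash: {entry!r}")
    parts = [p for p in path.split("/") if p]
    if parts and parts[0] == "All Files":
        raise ValueError(f"do not include the 'All Files' folder: {entry!r}")
    return "/" + "/".join(parts)


def accounts_in_scope(all_accounts: list, configured: list) -> list:
    """An empty configured list means that all accounts are crawled."""
    if not configured:
        return list(all_accounts)
    return [account for account in all_accounts if account in configured]


print(normalize_folder_entry("/Projects/2018/"))
```

The same "empty list means everything" logic applies to the folder list of each account.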