Processing administration

Note: RelativityOne is currently upgrading the Processing infrastructure. The content on this page reflects the latest framework. If you see Processing Administration > Processing and Imaging Queue, you should continue with the topic below. If your tabs are Processing Administration > Worker Monitoring, see the Processing administration legacy topic.

The Processing Administration tab provides a centralized location for you to access data on active processing and imaging jobs throughout your Relativity environment, as well as the status of all workers assigned to perform those jobs. You can also use the Processing History sub-tab to identify all actions taken related to processing in your environment. Active jobs are shown on the Processing and Imaging Queue tab.

Processing and Imaging Queue

Security considerations for processing administration

Consider the following items related to security and client domains (formerly multi-tenancy):

  • If you're the system admin for a client domain environment, Relativity restricts your tenants to seeing only the jobs in their own client domain. This prevents information leaks involving workers that don't operate within your client domain.
  • In client domain environments, users from one client domain can't see any workers from other client domains.
  • In client domain environments, users from one client domain can only see work from their workspace. All other threads show an Item secured value for the Workspace field, and the rest of the columns are blank.

Note: To change the priority of a job where Customer lockbox is enabled, you must belong to both a permission group that has access to the workspace where the job originated and the System Administrator's group. For more information, see Customer lockbox.

Note: Only System Administrators can modify processing jobs on the Processing and Imaging Queue tab. Other users with instance-level permissions can see the Processing and Imaging Queue tab, but they receive an error if they attempt to modify processing jobs.

Groups don't have access to the Processing Administration tab or sub-tabs by default. To grant them access, perform the following steps:

  1. From Home, navigate to the Instance Details sub-tab under kCura Admin.
  2. In the Security box, click Manage Permissions.
  3. In the Admin Security window, select Tab Visibility.
  4. From the drop-down list at the top right, select the group to which you want to grant access.
  5. Select Processing Administration, Processing and Imaging Queue, and Processing History.
     Admin security window for processing admin permissions

  6. Click Save.

You must also have the View Admin Repository permission set in the Admin Operations console in the Instance Details tab to use the Processing Administration tab.

Admin security window for View admin repository permission

Monitoring active jobs

To see all active processing and imaging jobs in the environment, open the Active Jobs view in the Processing and Imaging Queue tab. If no jobs are visible in this view, no jobs are currently running in the environment.

  • Jobs that are running in workspaces to which you don't have permissions display the placeholder text "Item Secured" in the Active Jobs view. Actual job details are not visible. To permit visibility, see Workspace Security.
  • The Workspaces tree on the left only contains workspaces in which an active job is currently running.

The following columns appear on Active Jobs view:

Active jobs pane

  • Workspace – the workspace in which the job was created. Click the name of a workspace to navigate to the main tab in that workspace.
  • Set Name – the name of the processing set. Click a set name to navigate to the Processing Set Layout on the Processing Sets tab. From here you can cancel publishing or edit the processing set.
  • Data Source – the data source containing the files you're processing. This appears as either the name you gave the source when you created it or an artifact ID if you didn't provide a name.
  • Job Type – the type of job running. The worker manager server handles processing and imaging jobs.
    Note: Filtering jobs aren't represented in the queue.

  • Status – the status of the set. If you're unable to view the status of any processing jobs in your environment, check to make sure the Server Manager agent is running. This field could display any of the following status values:
    • Waiting
    • Canceling
    • Finalizing
    • Unavailable
    • Inventorying
    • Discover
    • Publish
    • Imaging
    • Initializing
    • Retrieving/Retrying Errors
    • Submitting Job
  • Documents Remaining – the number of documents that have yet to be inventoried, discovered, or published. The value in this field goes down incrementally as data extraction progresses on the processing set.

    Note: This column displays a value of -1 if you have clicked Inventory Files, Discover Files, or Publish Files and the job has not yet started.

  • Priority – the order in which jobs in the queue are processed. Lower priority numbers result in higher priority. This is determined by the value of the Order field on the data source. You can change the priority of a data source with the Change Priority button at the bottom of the view. Changing the priority only changes the priority for that immediate job.
    • Resources are split equally between processing sets of the same priority.
      Note: Resource distribution is also considered at the workspace level to make sure that all jobs are making progress.

    • Discovery, publishing, and imaging jobs are multi-threaded and can run in parallel, depending on the number of agents available.
    • Job types have the following priorities set by default:
      • Imaging/TIFF-on-the-fly jobs have a priority of 1 by default and will always run first.
      • Publishing jobs have a priority of 90 and will always run after any imaging on the fly jobs and before all other jobs.
      • Inventory, Discovery, Mass Imaging/Imaging Set jobs all have a priority of 100 in the queue. These jobs have resources shared equally as long as they are the same priority.
    • If you have reason for globally setting certain types of jobs to always run at a lower priority, please contact Support.

  • Job Paused – the Yes/No value indicates whether the job was paused. A job is typically paused when there is an issue with the processing agent. You cannot manually pause a processing job.
  • Paused Time – the time at which the job was paused, based on local time.
  • Failed Attempts – the number of times an automatic retry was attempted and failed.
  • Submitted Date – the date and time the job was submitted, based on local time.
  • Submitted By – the name of the user who submitted the job.
  • Last Activity – the date and time at which the job last communicated with the worker.
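
The default priorities described under the Priority column can be modeled with a short Python sketch. This is only an illustration of the ordering rules stated above, not Relativity's actual scheduler: the QueuedJob record, the job type labels, and the run_order helper are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Default priorities taken from the Priority column description.
# The dictionary keys are illustrative labels, not official job type names.
DEFAULT_PRIORITY = {
    "Imaging on the fly": 1,
    "Publish": 90,
    "Inventory": 100,
    "Discover": 100,
    "Mass Imaging": 100,
}


@dataclass
class QueuedJob:
    """Hypothetical stand-in for one row of the Active Jobs view."""
    set_name: str
    job_type: str
    priority: Optional[int] = None  # None means "use the job type's default"

    def effective_priority(self) -> int:
        if self.priority is not None:
            return self.priority
        return DEFAULT_PRIORITY[self.job_type]


def run_order(jobs):
    """Group jobs into priority tiers; lower numbers are served first.

    Jobs in the same tier are listed together because the documentation
    says resources are split equally between sets of the same priority.
    """
    tiers = {}
    for job in jobs:
        tiers.setdefault(job.effective_priority(), []).append(job.set_name)
    return [(priority, tiers[priority]) for priority in sorted(tiers)]
```

In this model, a discovery job whose priority was changed to 50 lands ahead of a publishing job at the default of 90, while two jobs left at 100 share the last tier.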

At the bottom of the screen, the active jobs mass operations buttons appear.

Active jobs mass operations

A number of mass operations are available on the Active Jobs view.

Active jobs mass operations menu

  • Cancel Imaging Job - cancel an imaging job. If you have processing jobs selected when you click Cancel Imaging Job, the processing jobs are skipped over and are allowed to proceed. When you cancel an imaging job, it signals to the workers to finish their current batch of work, which may take a few minutes.
  • Resume Processing Job - resumes any paused processing jobs that have exceeded the failed retry attempt count. You can resume multiple jobs at the same time. When you select this option, non-processing jobs are skipped, as are jobs that aren't currently paused.
  • Change Priority - change the priority of processing jobs in the queue.
    • When you click Change Priority, you must enter a new priority value in the Priority field. Then click Change Priority to proceed with the change.
      Change priority confirmation window
      • If you change the priority of a publish or republish job, you update the priorities of other publish and republish jobs from the same processing set. This ensures that deduplication is performed in the order designated on the set.
      • When you change the priority of an inventory job, you update the priorities of other inventory jobs from the same processing set. This ensures that filtering files is available as expected for the processing set.
      • While there is no option to pause discovery, changing the priority of a discovery job is a viable alternative.
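
The propagation rules for Change Priority can be sketched in Python as follows. This is a simplified model of the behavior described in the bullets above, not Relativity code: the job dictionaries, the LINKED_TYPES table, and the change_priority function are hypothetical. Because the bullets don't mention propagation for discovery jobs, the sketch assumes a discovery change affects only the selected job.

```python
# Job types whose priority change propagates within the same processing set.
# Publish and republish jobs move together; inventory jobs move together.
# Any other job type is assumed (not documented) to change on its own.
LINKED_TYPES = {
    "Publish": {"Publish", "Republish"},
    "Republish": {"Publish", "Republish"},
    "Inventory": {"Inventory"},
}


def change_priority(jobs, selected, new_priority):
    """Apply new_priority to the selected job and its linked jobs.

    jobs is a list of dicts with "set_name", "job_type", and "priority" keys.
    Returns the list of jobs that were updated.
    """
    linked = LINKED_TYPES.get(selected["job_type"], set())
    updated = []
    for job in jobs:
        same_set = job["set_name"] == selected["set_name"]
        if job is selected or (same_set and job["job_type"] in linked):
            job["priority"] = new_priority
            updated.append(job)
    return updated
```

Keeping publish and republish jobs of one set at the same priority mirrors the stated goal of preserving the deduplication order designated on the set.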

Using the Processing History sub-tab

To view the details of all processing actions taken on all data sources in the environment, navigate to the Processing History sub-tab.

In the Workspaces tree on the left, you'll see all workspaces in the environment that have at least one published document in them. You can expand the tree and click on processing sets and data sources to filter on them.

If you don't have permissions to a workspace, you'll see an "Item Restricted" message for that workspace.

The Processing History view provides the following fields:

Processing history page

  • Workspace - the name of the workspace in which the processing job was run.
  • Processing Set - the name of the processing set that was run.
  • Processing Data Source - the name and artifact ID of the data source attached to the processing set.
  • Processing Profile - the profile associated with the processing set.
  • Status - the current status of the processing job.
  • Entity - the entity associated with the data source.
  • Source Path - the location of the data that was processed, as specified on the data source.
  • Preprocessed file count - the count of all native files before extraction/decompression, as they exist in storage.
  • Preprocessed file size - the sum of all the native file sizes, in bytes, before extraction/decompression, as they exist in storage.
  • Discovered document size - the sum, in bytes, of the sizes of all discovered native files that aren't classified as containers, as they exist in storage.
  • Discovered files - the number of files from the data source that were discovered.
  • Nisted file count - the count of all files removed by deNIST during discovery, if deNIST was enabled on the processing profile.
  • Nisted file size - the sum, in bytes, of the sizes of all files removed by deNIST during discovery, if deNIST was enabled on the processing profile.
  • Published documents size - the sum of published native file sizes, in bytes, associated with the user, processing set, and workspace.
  • Published documents - the count of published native files associated with the user, processing set, and workspace.
  • Total file count - the count of all native files (including duplicates and containers) as they exist after decompression and extraction.
  • Total file size - the sum of all native file sizes (including duplicates and containers), in bytes, as they exist after decompression and extraction.
  • Last publish time submitted - the date and time at which publish was last started on the processing set.
  • Discover time submitted - the date and time at which discovery was last started on the processing set.
  • Last activity - the date and time at which any action was taken on the processing set.

You have the option of exporting any available processing history data to a CSV file through the Export to CSV mass operation at the bottom of the view.

Export processing history page button
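
Once exported, the CSV file can be summarized outside Relativity. The following Python sketch totals published bytes per workspace. It assumes that the column headers in the export match the field names listed above ("Workspace" and "Published documents size") and that sizes are written as plain integers; check both against an actual export before relying on it. The function name is hypothetical.

```python
import csv
from collections import defaultdict


def published_bytes_by_workspace(lines):
    """Sum the "Published documents size" column for each workspace.

    lines is any iterable of CSV text lines, such as an open file.
    Rows with an empty size value are skipped.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(lines):
        size = (row.get("Published documents size") or "").strip()
        if size:
            totals[row["Workspace"]] += int(size)
    return dict(totals)
```

To run it against a file, open the export with `open("processing_history.csv", newline="")` and pass the file object to the function; the file name here is only a placeholder.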

Auto refresh options for processing history

The Processing History tab receives processing information when it loads and updates every time the page refreshes.

To configure the rate at which the view automatically refreshes, select a value from the Auto refresh drop-down at the bottom right of the view.

Autorefresh dropdown for processing history

  • Disabled - prevents the automatic refresh of the view, so processing history information updates only when you manually refresh the page. This option is useful during heavy processing usage because it gives you more control over the refresh rate and keeps the information from changing constantly while you monitor the work being performed. This is the default because, if your environment contains many workspaces and data sources, loading all of the data can take a long time, and you may not want that to happen on an auto-refresh interval.
  • 30 seconds - refreshes the processing history view automatically every 30 seconds.
  • 1 minute - refreshes the processing history view automatically every minute.
  • 5 minutes - refreshes the processing history view automatically every five minutes.