Smart De-duplication in your Library

Elicit automatically detects potential duplicates in papers you upload to your Library. This guide explains how the de-duplication works and how to use it.

How we detect duplicates

Elicit uses semantic search and large language models (LLMs) to identify duplicates among your papers. Instead of only looking for papers with the same keywords in the title or abstract, or with the same DOI, Elicit checks whether the content is similar. For example, a paper and its pre-print may have different titles and abstracts but still cover the same content. Reference managers with strict rule-based de-duplication wouldn't detect these similarities, but Elicit can.
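
To make the difference concrete, here is a minimal Python sketch that contrasts a strict rule-based check with a similarity-based one. It is only an illustration, not Elicit's actual implementation: the bag-of-words cosine similarity stands in for real semantic search, and the function names, the 0.5 threshold, the sample papers, and the placeholder DOI are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rule_based_match(p, q):
    """Strict check: identical DOI or identical title."""
    same_doi = p["doi"] is not None and p["doi"] == q["doi"]
    same_title = p["title"].strip().lower() == q["title"].strip().lower()
    return same_doi or same_title


def similarity_match(p, q, threshold=0.5):
    """Content check: title plus abstract are similar enough.

    The threshold is a hypothetical value chosen for this example.
    """
    a = tokens(p["title"] + " " + p["abstract"])
    b = tokens(q["title"] + " " + q["abstract"])
    return cosine(a, b) >= threshold


# Made-up sample records; the DOI is a placeholder, not a real identifier.
preprint = {
    "doi": None,
    "title": "Sparse attention for long documents",
    "abstract": "We study sparse attention patterns that let transformers "
                "read long documents efficiently.",
}
published = {
    "doi": "10.0000/example",
    "title": "Efficient transformers for long documents via sparse attention",
    "abstract": "Sparse attention patterns let transformers read long "
                "documents efficiently and accurately.",
}
unrelated = {
    "doi": None,
    "title": "Soil microbes in alpine meadows",
    "abstract": "A field survey of microbial diversity in alpine soil.",
}

print(rule_based_match(preprint, published))   # the strict rules miss the pair
print(similarity_match(preprint, published))   # the content check flags it
print(similarity_match(preprint, unrelated))   # unrelated papers are not flagged
```

In this sketch the pre-print and the published version have no DOI or title in common, so the strict rules pass over them, while the content comparison still pairs them up.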

Note: Elicit may not find every duplicate in your Library, so double-check your papers if you need to be 100% sure.

Scanning for duplicates

Elicit automatically looks for duplicates when you upload papers to your Library. To get started, click "Upload" in the top right of your Library. After Elicit finishes enriching your uploaded papers, it starts detecting duplicates automatically. You can track progress using the progress bar in the right sidebar:

[Image: duplicate-detection progress bar in the right sidebar]

When Elicit finishes scanning for duplicates, you can click on the "Review duplicates" button in the right sidebar to see and handle potential duplicates.

[Image: "Review duplicates" button in the right sidebar]

Taking action on duplicates

Each row shows a potential duplicate alongside its potential original paper from your Library. By default, the paper that was uploaded to your Library earlier is treated as the potential original. Colored fields mark where the duplicate differs from the original: differences in green will be included in a merge, while differences in red won't. In general, merging only adds new information (e.g., the duplicate has a PDF and the original doesn't).

[Image: duplicate review rows with differences highlighted in green and red]

You can take four actions for each duplicate pair:

  • Merge: new information from the duplicate gets added to the original. Conflicting information is discarded. You can preview the merge result in the right sidebar.

  • Keep original, delete duplicate: the duplicate is deleted and the original is kept (without merging).

  • Keep duplicate, delete original: the original is deleted and the duplicate is kept (without merging).

  • Keep both: both papers are kept in your Library.
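
The four actions can be sketched in a few lines of Python. This is a hedged illustration of the behavior described above, not Elicit's code: papers are modeled as plain dictionaries, and the field names, the action strings, and the `merge` and `resolve` helpers are hypothetical.

```python
def merge(original, duplicate):
    """Add fields the original lacks; discard values that conflict.

    Returns the merged record and the discarded conflicting values.
    In the review table, these would roughly correspond to the green
    (added) and red (discarded) differences.
    """
    merged = dict(original)
    discarded = {}
    for field, value in duplicate.items():
        if value is None:
            continue                      # the duplicate has nothing to add
        if merged.get(field) is None:
            merged[field] = value         # new information: added
        elif merged[field] != value:
            discarded[field] = value      # conflict: the original's value wins
    return merged, discarded


def resolve(original, duplicate, action):
    """Return the papers left in the Library after one action."""
    if action == "merge":
        return [merge(original, duplicate)[0]]
    if action == "keep_original":
        return [original]
    if action == "keep_duplicate":
        return [duplicate]
    if action == "keep_both":
        return [original, duplicate]
    raise ValueError(f"unknown action: {action}")


# Made-up example pair: the duplicate has a PDF, the original does not.
original = {"title": "Paper A", "year": 2020, "pdf": None}
duplicate = {"title": "Paper A (preprint)", "year": 2020, "pdf": "paper_a.pdf"}

merged, discarded = merge(original, duplicate)
print(merged)      # {'title': 'Paper A', 'year': 2020, 'pdf': 'paper_a.pdf'}
print(discarded)   # {'title': 'Paper A (preprint)'}
```

Here the merge picks up the PDF because the original has none, and drops the duplicate's title because it conflicts with the original's.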

You can also select multiple rows and bulk apply any of these actions.

Automatic detection

Elicit automatically runs duplicate detection on any files you upload to your Library. You don't need to act on the duplicates right away; they remain in your Library until you do. Any newly detected duplicates are added to the queue in the duplicates view.