Release 9.1

11 April 2024 by Aaron Ballagh in Release Notes

New retraction data, PatSeq Data and PatSeq Finder updates, improved Tags and more!

New Features and Improvements

Retraction Watch Data

With this release we are pleased to announce the addition of Retraction Watch data via Crossref, which brings additional retraction event information and fields. This has increased the number of retracted works in The Lens to 45,520.

The retracted status is now calculated from the retraction update events. To avoid false-positive retractions, all retraction events are sorted by date, and a work is marked as retracted only when a retraction event has no subsequent reinstatement event. This helps ensure reinstated works are not set to retracted, e.g. https://link.lens.org/Cz5YmO1Utwh. For works without a retraction event, the retracted status is not set to true unless it is set by another data source, e.g. PubMed, Microsoft Academic or OpenAlex.
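
The rule described above can be sketched in a few lines of Python. This is an illustrative reading of the paragraph, not the production logic of The Lens: the RetractionUpdate class, the is_retracted function and the flagged_by_other_source parameter are hypothetical names, and how same-day events or non-retraction natures are treated is an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetractionUpdate:
    """Hypothetical container for one retraction update event."""
    date: date
    nature: str  # e.g. "Retraction", "Expression of Concern", "Correction", "Reinstatement"


def is_retracted(events, flagged_by_other_source=False):
    """Return True when a retraction event has no later reinstatement event.

    Assumption: when a work has no "Retraction" event at all, the status
    falls back to whatever another data source reports.
    """
    if not any(e.nature == "Retraction" for e in events):
        return flagged_by_other_source

    retracted = False
    # Walk the events in date order; the last retraction/reinstatement wins.
    for event in sorted(events, key=lambda e: e.date):
        if event.nature == "Retraction":
            retracted = True
        elif event.nature == "Reinstatement":
            retracted = False
        # Other natures (corrections, expressions of concern) leave the flag unchanged.
    return retracted
```

Under this sketch, a work retracted in 2019 and reinstated in 2020 comes out as not retracted, whatever order the events arrive in.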

The associated changes include:

  • Added a Retracted filter to the Flag filters.
  • Added a Retracted pill to retracted works in the results list and on individual scholarly works.
  • Added a Retraction information box including publisher retraction URLs, reasons, notes and dates.
  • Added new retraction update searchable fields including:
    • Retraction Update Date (retraction_update.date): The date of the retraction update. Data source: Crossref/Retraction Watch.
    • Retraction Update Nature (retraction_update.nature): The nature of the retraction update (e.g. Retraction, Expression of Concern, Correction, Reinstatement).
    • Retraction Update Reason (retraction_update.reason): The reason for the retraction update (e.g. investigation by journal/publisher, notice - limited or no information, concerns/issues about data, unreliable results, investigation by third party, Falsification/Fabrication of Image, etc.).
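
As a rough illustration of how the new fields might be combined in a search, the sketch below builds a request body that filters on retraction_update.nature and retraction_update.date. Only the two field names come from this release; the build_retraction_query helper, the Elasticsearch-style bool/match/range structure, the size parameter and the ISO date format are assumptions, so check the API Documentation for the supported query forms before relying on it.

```python
import json


def build_retraction_query(nature, from_date, size=10):
    """Build a hypothetical search body for retraction updates.

    nature:    a retraction update nature, e.g. "Retraction"
    from_date: earliest retraction update date (assumed ISO format, YYYY-MM-DD)
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"retraction_update.nature": nature}},
                    {"range": {"retraction_update.date": {"gte": from_date}}},
                ]
            }
        },
        "size": size,
    }


payload = build_retraction_query("Retraction", "2023-01-01")
print(json.dumps(payload, indent=2))
```

Depending on how the retraction update fields are indexed, a different query form may be needed; the sketch only shows how the field names fit together.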

Combined with the analytical functions of The Lens, the new retraction filter and fields allow you to identify and explore works and patents that have cited retracted works through the Explore Citations tab, e.g. scholarly works citing retracted works or patents citing retracted works.

PatSeq Updates

We are also pleased to announce a number of significant updates to the PatSeq facility with this release:

  • Added sequences in the WIPO ST.26 standard, increasing the number of sequences in PatSeq data from 460M to 490M+.
  • Upgraded the PatSeq Finder BLAST server to BLAST+ 2.15.0.
  • Added LensId field to the Word/Excel PatSeq Finder exports.
  • Added pagination controls for PatSeq Finder results.
  • Updated the format of the patent keys and sequence identifiers used in PatSeq bulk data files.

Improved Tags Functionality

The functionality of user Tags has also been improved with this release. Adding an existing Tag to a record is now easier than ever with an autosuggest/lookup for the user’s Tags list. The Tags filtering has also been improved and deletion of tags with reserved characters has been fixed.

Other Improvements and Changes

The Lens is a work in progress, and with each release The Lens team improves features and the quality of the data and service to provide you with the most comprehensive data we can source. These changes often stem from users' feedback, but they also originate from improved data processing and improvements in the source data. The other improvements and changes implemented in this release are:

  • Added metrics carousel expander for viewing detailed metrics descriptions.
  • Added Akal University, the National Institute of Technology Calicut and the University of Rochester to the Registry of subscribing institutions and supporters.
  • Added abstract match snippet tooltip to search results in tabular view.
  • Enhanced the report builder with better image handling and display, and improved row layout and design.
  • Improved the preprint publication type identification rules, increasing the number of preprints from 1.2M to 2.9M, a 146% (1.7M) increase in the number of preprint publication type scholarly works. These have come mostly from scholarly works previously identified as unknown (1M) and journal article (0.7M) publication types.
  • Adjusted Open Access License data integration, reducing the unknown open-access license type for 3.2M scholarly works.
  • Updated the Source URL field for scholarly works to combine URLs from all available data sources, adding OpenAlex location URLs and DataCite URLs, with a uniqueness check.
  • Updated the scholarly works data source diagram on the dataset tab of the scholarly works structured search page and made the Venn diagram clickable.
  • Improved US assignments conveyance type mapping, mapping an additional 155K assignment types and improving the ownership history calculations.
  • Added request rate-limiting to user account related endpoints to prevent brute force attacks and spamming of users via password reset and account activation.
  • Added the Cited Category to the Cited Patents tab of individual patents.
  • Updated WorldCat and DOI linkouts.
  • Updated anticipated termination date and legal status calculations: recently granted patents whose earliest priority date is more than 20 years old and that are past the anticipated termination date are now categorised as Expired.
  • Added the Citation Categories to the cited patents on individual patent records.
  • Added the Cloudflare Turnstile widget to OAuth endpoints.
  • Updated the example Lens Reports, adding three new prototype institution reports for the National Institutes of Health, the Natural Sciences and Engineering Research Council of Canada and the European Molecular Biology Laboratory.
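
The updated Expired rule in the list above (earliest priority date more than 20 years old and past the termination date) can be read as a simple two-part check. The sketch below is one interpretation only: the function names are hypothetical, and the actual legal status calculation in The Lens takes more inputs than these two dates.

```python
from datetime import date


def add_years(d, years):
    """Shift a date by whole years, mapping 29 Feb to 28 Feb when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def is_expired(earliest_priority, termination_date, today):
    """Hypothetical check: both conditions from the release note must hold."""
    older_than_20_years = add_years(earliest_priority, 20) < today
    past_termination = termination_date < today
    return older_than_20_years and past_termination
```

For example, a patent with an earliest priority date of 1 January 2000 and a termination date of 1 January 2021 would be treated as Expired on the date of this release, while one whose termination date is still in the future would not.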

Fixes

  • Fix highlighting of search terms in patent claims and full text.
  • Fix Journal is Open Access values in the publication information box of individual scholarly works.
  • Fix the open access sources in the tooltip on the open access information box of individual scholarly works.
  • Fix alias Lens Ids being used in patent family members; this fix will be applied in data release 202414.
  • Fix password reset issues for users who registered using a linked service (ORCID or LinkedIn).
  • Fix notifications for saved queries containing a patent family filter.

Policy Updates

The Individual Commercial License has been updated: clause 7.3, New IP Rights, has been removed.

API & Data Improvements

Patent and Scholarly Works data:

  • Added Classification Value and Classification Symbol Position fields to the Classification Symbols object in the patent API response and JSON Lines exports:
    • classification_value: I - Invention, L - Later. Applies to CPC and IPCR Classifications only.
    • classification_symbol_position: F - First, A - Additional. Applies to CPC and IPCR Classifications only.
  • Added Sequence Listing metadata to the patent API response and JSON Lines exports. The Sequence Listing fields include:
    • sequence_types: The type of sequences listed on the patent document, e.g. N - nucleotide (including DNA and RNA sub-types), P - peptides/proteins.
    • length_buckets: Preset sequence length ranges (nucleotide: “0-100”, “101-5000”, “5001-100k”, “>100k”; peptide: “0-50”, “51-300”, “>300”).
    • organisms: List of declared organisms associated with the sequences listed on the patent document.
      • tax_id: The NCBI taxonomic identifier of the declared organism.
      • name: The name of the declared organism.
    • count: The number of sequences listed on the patent document.
  • Added the cited patent Citation Phase and Citation Category fields to the searchable patent fields:
    • reference_cited.patent.cited_phase: The application phase in which a cited patent was added to a patent document. Citation phase values include SEA, ISR, SUP, PRS, APP, EXA, OPP, APL, FOP and TPO. Citation phase definitions are available here.
    • reference_cited.patent.category: Cited patent documents are identified by letter(s) indicating the category of the cited document. Citation category letters include X, I, Y, A, O, P, T, E, D, L and R. Citation category definitions are available here.
  • Added the Citation Category field to the Citation object in the patent API response and JSON Lines exports for cited patents.
    • category: Cited patent documents are identified by letter(s) indicating the category of the cited document. Citation category letters include X, I, Y, A, O, P, T, E, D, L and R. Citation category definitions are available here.
  • Added the Data Source field to the searchable scholarly works fields:
    • data_source.name: The data source(s) for the scholarly work (mag, pubmed, crossref, openalex, publisher).
  • Updated the Classification Explorer to use the latest CPC schema (version 2024.01).
  • Improved bulk data processing and preparation for migration to delta processing.
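
To make the length_buckets values listed above concrete, the sketch below maps a sequence length to one of the preset ranges. The bucket labels are taken from the list; the length_bucket function, the inclusive upper bounds and the reading of “100k” as 100,000 are assumptions for illustration.

```python
# (inclusive upper bound, label) pairs, using the labels listed for length_buckets.
NUCLEOTIDE_BUCKETS = [(100, "0-100"), (5000, "101-5000"), (100_000, "5001-100k")]
PEPTIDE_BUCKETS = [(50, "0-50"), (300, "51-300")]


def length_bucket(sequence_type, length):
    """Return the bucket label for a sequence of the given type and length.

    sequence_type: "N" for nucleotide or "P" for peptide/protein.
    """
    if sequence_type == "N":
        bounds, overflow = NUCLEOTIDE_BUCKETS, ">100k"
    elif sequence_type == "P":
        bounds, overflow = PEPTIDE_BUCKETS, ">300"
    else:
        raise ValueError(f"unknown sequence type: {sequence_type!r}")
    if length < 0:
        raise ValueError("length must be non-negative")

    for upper, label in bounds:
        if length <= upper:
            return label
    return overflow
```

A 250-residue peptide would fall into “51-300” under these assumptions, and a 150,000-base nucleotide sequence into “>100k”.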

Lens API (version 2.11.0):

  • Added API handling for multiple scroll contexts and added support for point-in-time/search after scrolling. Users should see no changes and can continue using the existing cursor-based pagination method. Note: API queries are stored in cache for the duration of the scroll Id only.
  • Bulk data API users can now view their bulk file downloads in the usage endpoint.
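
For readers new to cursor-based pagination, the loop below sketches the general pattern: send the query once, then keep requesting the next page with the returned scroll Id until no records come back. The request and response keys (scroll, scroll_id, data), the one-minute scroll lifetime and the iter_all_results helper are assumptions for illustration, and fake_fetch_page is a stand-in for a real HTTP call; see the API Documentation for the actual parameters.

```python
from typing import Callable, Iterator


def iter_all_results(fetch_page: Callable[[dict], dict], query: dict) -> Iterator[dict]:
    """Yield every record by following scroll Ids until a page is empty."""
    request = dict(query, scroll="1m")  # assumed: ask for a scroll context
    while True:
        response = fetch_page(request)
        records = response.get("data", [])
        if not records:
            return
        yield from records
        scroll_id = response.get("scroll_id")
        if not scroll_id:
            return
        # Follow-up requests carry only the scroll Id, since the query is cached with it.
        request = {"scroll_id": scroll_id, "scroll": "1m"}


def fake_fetch_page(request: dict) -> dict:
    """Stand-in for the HTTP call; serves three canned pages keyed by scroll Id."""
    pages = {
        None: {"data": [{"lens_id": "A"}, {"lens_id": "B"}], "scroll_id": "s1"},
        "s1": {"data": [{"lens_id": "C"}], "scroll_id": "s2"},
        "s2": {"data": [], "scroll_id": "s2"},
    }
    return pages[request.get("scroll_id")]


ids = [r["lens_id"] for r in iter_all_results(fake_fetch_page, {"query": "retracted"})]
```

Passing the fetch function in as a parameter keeps the loop independent of any HTTP library, so the same generator can be exercised against canned pages before it is pointed at a live endpoint.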

For guidance navigating the API, please check the API Documentation. To report bugs or issues, ask usability-related questions, or request features, please use our GitHub issue tracker.

Interested in supporting the sustainability of The Lens?

  • Contact your library, research office or technology transfer and commercialization teams, and let them know about Lens for Institutions and the Institutional Toolkits.
  • If you need to use Lens.org for commercial purposes, please comply with the Lens Terms of Use by subscribing to an Enterprise Toolkit or, if you are a sole practitioner or SME, an individual commercial use license. See account and pricing options. If you need more than 4 seats or have other special circumstances, please contact us at support@lens.org.
  • Do you know someone who would be interested in The Lens? Share a search or collection with them, or suggest that they register for our newsletter here.

Interested in helping us improve?

  • Collective Action needs connected actors. The Collective Action Project provides a number of different activities for all stakeholders to participate in collective problem solving. See how you can participate.
  • Are you supportive of The Lens mission and interested in volunteering your expertise and time to support The Lens? We welcome expressions of interest. Please contact us at support@lens.org. We are happy to learn more about your skill sets and see if there is a good fit of skills and timing to contribute to The Lens.
  • If you have published using Lens tools or data, or created an application using the API or data from the Lens, please share your work with us so we can share it with others on Lens Labs to help improve The Lens as a public resource.
  • Share your use case with us and explore how The Lens can help you, and keep sending us your feedback on The Lens features and functionality.

The Lens is a public good project run by Cambia, a global social enterprise committed to making the innovation process more transparent, inclusive and effective for those seeking to solve problems and make a social impact. We welcome your active engagement, participation and support.