Friday, October 10, 2025

Clowning around with EndNote 2025's AI Research Assistant (clown shenanigans part II!)

[Image: balloons]

EndNote 2025 now has an AI-powered Research Assistant tool to accompany its Key Takeaway tool. This new feature lets you ask the Research Assistant (i.e., a chatbot) questions, which it answers using the content of the PDF you attach to a reference. Note that to use the tool, EndNote 2025 users must work in a library they have synced with the online version of EndNote.

Of course, I couldn't help but resort to my old clown antics to test out the tool, just like I did when testing out the Key Takeaway tool.

For this experiment, I tested whether the Research Assistant could pick up on clown nonsense in a study's methods. To do this, I replaced the abstract and methods sections of one of my own articles about clinical trials information sharing with nonsense about clowns (for a humorous read, you can take a look at the manipulated PDF here) and attached it to its corresponding reference in EndNote. I also replaced the abstract metadata in the EndNote record with clown nonsense. After syncing my library, I entered the following question into the Research Assistant: "What methods were used to determine the extent of data sharing?"
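(As an aside: if you ever wanted to run this kind of tampering test across many PDFs, the page-swapping itself is easy to script. Below is a minimal, hypothetical sketch using the pypdf and reportlab Python libraries; the file names, page numbers, and replacement text are just placeholders for illustration, not the actual ones from my experiment.)

```python
# Hypothetical sketch: swap selected pages of a PDF for machine-generated
# clown nonsense, using pypdf and reportlab. File names, page numbers, and
# replacement text are placeholders.
from io import BytesIO

from pypdf import PdfReader, PdfWriter
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

CLOWN_LINES = [
    "Methods: Observers monitored high-traffic areas for unsolicited honking.",
    "Environmental samples were screened for balloon animal residue.",
]

def make_nonsense_page():
    """Render the replacement text to a one-page, in-memory PDF."""
    buf = BytesIO()
    c = canvas.Canvas(buf, pagesize=letter)
    y = 700
    for line in CLOWN_LINES:
        c.drawString(72, y, line)  # 1-inch left margin
        y -= 18
    c.save()
    buf.seek(0)
    return PdfReader(buf).pages[0]

def swap_pages(src, dst, pages_to_replace):
    """Copy src to dst, substituting the nonsense page at the given 0-based indexes."""
    reader = PdfReader(src)
    nonsense = make_nonsense_page()
    writer = PdfWriter()
    for i, page in enumerate(reader.pages):
        writer.add_page(nonsense if i in pages_to_replace else page)
    with open(dst, "wb") as f:
        writer.write(f)

# e.g., replace the first two pages (abstract + methods) before attaching to EndNote
swap_pages("article.pdf", "article_clowned.pdf", {0, 1})
```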

Theoretically, the Research Assistant should have either stated that the authors didn't specify their methods for determining the extent of data sharing, or mentioned something along the lines of balloon animal residue analysis and unsolicited honking detection, but it did nothing of the sort. Instead, it appeared to draw on the results and introduction sections to inform its methods summary. It didn't mention clowns at all, even though every section relating to the methods had been replaced with clown nonsense.

[Screenshot: the Research Assistant's answer, transcribed below]

Here is the output of the Research Assistant, as the screenshot can be a bit hard to read:

  • Methods used to determine data sharing
    • The study analyzed 206 ClinicalTrials.gov (CTG) records of US federally funded COVID-19 clinical studies that included at least 200 participants.
    • It evaluated the records for links to supporting documents, individual participant data (IPD) sharing information, and associated publications.
    • The researchers looked for specific sections in the CTG records that indicated the presence of protocols, informed consent forms, and statistical analysis plans.
    • Data was collected on how many records provided links to these materials and the mechanisms for accessing IPD, as well as any discrepancies in the information provided.

To be fair, the Research Assistant did pick up on the clown nonsense when I asked it to "Summarize the methods used for this study."

[Screenshot: the Research Assistant's methods summary, transcribed below]

Here is the output for that query, as well:

  • Methods summary
    • Study design: the research employed a multi-phase, interdisciplinary approach to investigate the discoverability of supporting research materials in COVID-19 clinical studies on ClinicalTrials.gov.
    • Data Collection:
      • Behavioral surveillance: Observers were trained to recognize clown behaviors, using tools like binoculars and audio recorders to monitor high-traffic areas for clown-like activities.
      • Material forensics: Environmental samples were collected from locations identified as high clown activity zones, focusing on traces like wig fibers and greasepaint residue.
      • Psychological profiling: Participants were assessed using three instruments to gauge their clown potential, including a questionnaire and observational tests.
    • Data Analysis: Observations and samples were catalogued and scored using the Clown Threat Index (CTI), which quantified behaviors to identify individuals with clown-like tendencies.
    • Ethics and oversight: The study was approved by the Institutional Review Board, ensuring ethical standards in data collection and participant interaction.

While I hope it's unlikely that authors would replace their methods sections with utter nonsense, this little experiment demonstrates a key limitation of AI PDF summarizers: they can miss vital context within the full text. Researchers hoping to save time with these summarizers may overlook (potentially critical!) methodological flaws in a study, such as in this case, where there was, to say the least, a mismatch between the methods and the aims of the study. While PDF summarization tools and chatbots can be convenient, researchers should exercise caution when using them and always verify information against the full text.
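For anyone who wants a quick, automated smell test before trusting a summary, one crude option is to check whether the content words in each summary sentence actually appear anywhere in the PDF's extracted text. The sketch below is purely illustrative (the stopword list and threshold are arbitrary, and it's no substitute for reading the paper); note that it can catch summaries built on words that never occur in the document, but it would not have caught my first case, where the Assistant drew on real text from the wrong sections.

```python
# Hypothetical sketch: flag summary sentences whose content words rarely
# appear in the PDF's extracted text. A crude grounding check only, not a
# substitute for reading the paper.
import re

from pypdf import PdfReader

STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "for", "on", "was",
             "were", "that", "this", "with", "as", "it", "its"}

def content_words(text):
    """Lowercased words longer than three letters, minus common stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOPWORDS and len(w) > 3}

def ungrounded_sentences(summary, pdf_path, threshold=0.5):
    """Return summary sentences where under `threshold` of content words occur in the PDF."""
    pdf_text = " ".join(page.extract_text() or "" for page in PdfReader(pdf_path).pages)
    pdf_words = content_words(pdf_text)
    flagged = []
    for sent in re.split(r"(?<=[.!?])\s+", summary):
        words = content_words(sent)
        if words and len(words & pdf_words) / len(words) < threshold:
            flagged.append(sent)
    return flagged

# e.g., paste the Research Assistant's answer in as a string:
# print(ungrounded_sentences(assistant_answer, "article_clowned.pdf"))
```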

For some additional resources relating to generative AI in the health sciences, check out:

Thanks for reading, and I hope everyone has a great weekend! 
