%0 Journal Article %@ 2369-1999 %I JMIR Publications %V 11 %N %P e69663 %T Identifying Adverse Events in Outpatients With Prostate Cancer Using Pharmaceutical Care Records in Community Pharmacies: Application of Named Entity Recognition %A Yanagisawa,Yuki %A Watabe,Satoshi %A Yokoyama,Sakura %A Sayama,Kyoko %A Kizaki,Hayato %A Tsuchiya,Masami %A Imai,Shungo %A Someya,Mitsuhiro %A Taniguchi,Ryoo %A Yada,Shuntaro %A Aramaki,Eiji %A Hori,Satoko %+ , Division of Drug Informatics, Keio University Faculty of Pharmacy, 1-5-30 Shibakoen, Minato-ku, Tokyo, 105-8512, Japan, 81 3 5400 2650, satokoh@keio.jp %K natural language processing %K pharmaceutical care records %K androgen receptor axis-targeting agents %K adverse events %K outpatient care %D 2025 %7 11.3.2025 %9 Original Paper %J JMIR Cancer %G English %X Background: Androgen receptor axis-targeting reagents (ARATs) have become key drugs for patients with castration-resistant prostate cancer (CRPC). ARATs are taken long term in outpatient settings, and effective adverse event (AE) monitoring can help prolong treatment duration for patients with CRPC. Despite the importance of monitoring, few studies have identified which AEs can be captured and assessed in community pharmacies, where pharmacists in Japan dispense medications, provide counseling, and monitor potential AEs for outpatients prescribed ARATs. Therefore, we anticipated that a named entity recognition (NER) system might be used to extract AEs recorded in pharmaceutical care records generated by community pharmacists. Objective: This study aimed to evaluate whether an NER system can effectively and systematically identify AEs in outpatients undergoing ARAT therapy by reviewing pharmaceutical care records generated by community pharmacists, focusing on assessment notes, which often contain detailed records of AEs. 
Additionally, the study sought to determine whether outpatient pharmacotherapy monitoring can be enhanced by using NER to systematically collect AEs from pharmaceutical care records. Methods: We used an NER system based on the widely used Japanese medical term extraction system MedNER-CR-JA, which uses Bidirectional Encoder Representations from Transformers (BERT). To evaluate its performance for pharmaceutical care records by community pharmacists, the NER system was first applied to 1008 assessment notes in records related to anticancer drug prescriptions. Three pharmaceutically proficient researchers compared the results with the annotated notes assigned symptom tags according to annotation guidelines and evaluated the performance of the NER system on the assessment notes in the pharmaceutical care records. The system was then applied to 2193 assessment notes for patients prescribed ARATs. Results: The F1-score for exact matches of all symptom tags between the NER system and annotators was 0.72, confirming the NER system has sufficient performance for application to pharmaceutical care records. The NER system automatically assigned 1900 symptom tags for the 2193 assessment notes from patients prescribed ARATs; 623 tags (32.8%) were positive symptom tags (symptoms present), while 1067 tags (56.2%) were negative symptom tags (symptoms absent). Positive symptom tags included ARAT-related AEs such as “pain,” “skin disorders,” “fatigue,” and “gastrointestinal symptoms.” Many other symptoms were classified as serious AEs. Furthermore, differences in symptom tag profiles reflecting pharmacists’ AE monitoring were observed between androgen synthesis inhibition and androgen receptor signaling inhibition. Conclusions: The NER system successfully extracted AEs from pharmaceutical care records of patients prescribed ARATs, demonstrating its potential to systematically track the presence and absence of AEs in outpatients. 
Based on the analysis of a large volume of pharmaceutical medical records using the NER system, community pharmacists not only detect potential AEs but also actively monitor the absence of severe AEs, offering valuable insights for the continuous improvement of patient safety management. %M 40068144 %R 10.2196/69663 %U https://cancer.jmir.org/2025/1/e69663 %U https://doi.org/10.2196/69663 %U http://www.ncbi.nlm.nih.gov/pubmed/40068144 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 27 %N %P e56774 %T Reporting Quality of AI Intervention in Randomized Controlled Trials in Primary Care: Systematic Review and Meta-Epidemiological Study %A Zhong,Jinjia %A Zhu,Ting %A Huang,Yafang %+ , School of General Practice and Continuing Education, Capital Medical University, 4th Fl, Jieping Building, Capital Medical University, No.10 You An Men Wai Xi Tou Tiao, Fengtai district, Beijing, 100069, China, 86 18810673886, yafang@ccmu.edu.cn %K artificial intelligence %K randomized controlled trial %K reporting quality %K primary care %K meta-epidemiological study %D 2025 %7 25.2.2025 %9 Review %J J Med Internet Res %G English %X Background: The surge in artificial intelligence (AI) interventions in primary care trials lacks a study on reporting quality. Objective: This study aimed to systematically evaluate the reporting quality of both published randomized controlled trials (RCTs) and protocols for RCTs that investigated AI interventions in primary care. Methods: PubMed, Embase, Cochrane Library, MEDLINE, Web of Science, and CINAHL databases were searched for RCTs and protocols on AI interventions in primary care until November 2024. Eligible studies were published RCTs or full protocols for RCTs exploring AI interventions in primary care. 
The reporting quality was assessed using CONSORT-AI (Consolidated Standards of Reporting Trials–Artificial Intelligence) and SPIRIT-AI (Standard Protocol Items: Recommendations for Interventional Trials–Artificial Intelligence) checklists, focusing on AI intervention–related items. Results: A total of 11,711 records were identified. In total, 19 published RCTs and 21 RCT protocols for 35 trials were included. The overall proportion of adequately reported items was 65% (172/266; 95% CI 59%-70%) and 68% (214/315; 95% CI 62%-73%) for RCTs and protocols, respectively. The percentage of RCTs and protocols that reported a specific item ranged from 11% (2/19) to 100% (19/19) and from 10% (2/21) to 100% (21/21), respectively. The reporting of both RCTs and protocols exhibited similar characteristics and trends. Both lacked transparency and completeness in three respects: they did not provide adequate information regarding the input data, did not mention the methods for identifying and analyzing performance errors, and did not state whether and how the AI intervention and its code can be accessed. Conclusions: The reporting quality could be improved in both RCTs and protocols. This study helps promote the transparent and complete reporting of trials with AI interventions in primary care. 
%M 39998876 %R 10.2196/56774 %U https://www.jmir.org/2025/1/e56774 %U https://doi.org/10.2196/56774 %U http://www.ncbi.nlm.nih.gov/pubmed/39998876 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 26 %N %P e63476 %T Discovering Time-Varying Public Interest for COVID-19 Case Prediction in South Korea Using Search Engine Queries: Infodemiology Study %A Ahn,Seong-Ho %A Yim,Kwangil %A Won,Hyun-Sik %A Kim,Kang-Min %A Jeong,Dong-Hwa %+ Department of Artificial Intelligence, The Catholic University of Korea, Jibong-Ro 43 3-1, Bucheon-Si, Republic of Korea, 82 2 2164 5564, kangmin89@catholic.ac.kr %K COVID-19 %K confirmed case prediction %K search engine queries %K query expansion %K word embedding %K public health %K case prediction %K South Korea %K search engine %K infodemiology %K infodemiology study %K policy %K lifestyle %K machine learning %K machine learning techniques %K utilization %K temporal variation %K novel framework %K temporal %K web-based search %K temporal semantics %K prediction model %K model %D 2024 %7 16.12.2024 %9 Original Paper %J J Med Internet Res %G English %X Background: The number of confirmed COVID-19 cases is a crucial indicator of policies and lifestyles. Previous studies have attempted to forecast cases using machine learning techniques that use a previous number of case counts and search engine queries predetermined by experts. However, they have limitations in reflecting temporal variations in queries associated with pandemic dynamics. Objective: This study aims to propose a novel framework to extract keywords highly associated with COVID-19, considering their temporal occurrence. We aim to extract relevant keywords based on pandemic variations using query expansion. Additionally, we examine time-delayed web-based search behavior related to public interest in COVID-19 and adjust for better prediction performance. 
Methods: To capture temporal semantics regarding COVID-19, word embedding models were trained on a news corpus, and the top 100 words related to “Corona” were extracted over 4-month windows. Time-lagged cross-correlation was applied to select optimal time lags correlated to confirmed cases from the expanded queries. Subsequently, ElasticNet regression models were trained after reducing the feature dimensions using principal component analysis of the time-lagged features to predict future daily case counts. Results: Our approach successfully extracted relevant keywords depending on the pandemic phase, encompassing keywords directly related to COVID-19, such as its symptoms, and its societal impact. Specifically, during the first outbreak, keywords directly linked to COVID-19 and past infectious disease outbreaks similar to those of COVID-19 exhibited a high positive correlation. In the second phase of the pandemic, as community infections emerged, keywords related to the government’s pandemic control policies were frequently observed with a high positive correlation. In the third phase of the pandemic, during the delta variant outbreak, keywords such as “economic crisis” and “anxiety” appeared, reflecting public fatigue. Consequently, prediction models trained by the extracted queries over 4-month windows outperformed previous methods for most predictions 1-14 days ahead. Notably, our approach showed significantly higher Pearson correlation coefficients than models based solely on the number of past cases for predictions 9-11 days ahead (P=.02, P<.01, and P<.01), in contrast to heuristic- and symptom-based query sets. Conclusions: This study proposes a novel COVID-19 case-prediction model that automatically extracts relevant queries over time using word embedding. The model outperformed previous methods that relied on static symptom-based or heuristic queries, even without prior expert knowledge. 
The results demonstrate the capability of our approach to track temporal shifts in public interest regarding changes in the pandemic. %M 39680913 %R 10.2196/63476 %U https://www.jmir.org/2024/1/e63476 %U https://doi.org/10.2196/63476 %U http://www.ncbi.nlm.nih.gov/pubmed/39680913 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 8 %N %P e55856 %T Screening for Depression and Anxiety Using a Nonverbal Working Memory Task in a Sample of Older Brazilians: Observational Study of Preliminary Artificial Intelligence Model Transferability %A Georgescu,Alexandra Livia %A Cummins,Nicholas %A Molimpakis,Emilia %A Giacomazzi,Eduardo %A Rodrigues Marczyk,Joana %A Goria,Stefano %K depression %K anxiety %K Brazil %K machine learning %K n-back %K working memory %K artificial intelligence %K gerontology %K older adults %K mental health %K AI %K transferability %K detection %K screening %K questionnaire %K longitudinal study %D 2024 %7 12.12.2024 %9 %J JMIR Form Res %G English %X Background: Anxiety and depression represent prevalent yet frequently undetected mental health concerns within the older population. The challenge of identifying these conditions presents an opportunity for artificial intelligence (AI)–driven, remotely available, tools capable of screening and monitoring mental health. A critical criterion for such tools is their cultural adaptability to ensure effectiveness across diverse populations. Objective: This study aims to illustrate the preliminary transferability of two established AI models designed to detect high depression and anxiety symptom scores. The models were initially trained on data from a nonverbal working memory game (1- and 2-back tasks) in a dataset by thymia, a company that develops AI solutions for mental health and well-being assessments, encompassing over 6000 participants from the United Kingdom, United States, Mexico, Spain, and Indonesia. 
We seek to validate the models’ performance by applying them to a new dataset comprising older Brazilian adults, thereby exploring their transferability and generalizability across different demographics and cultures. Methods: A total of 69 Brazilian participants aged 51-92 years were recruited with the help of Laços Saúde, a company specializing in nurse-led, holistic home care. Participants received a link to the thymia dashboard every Monday and Thursday for 6 months. The dashboard had a set of activities assigned to them that would take 10-15 minutes to complete, which included a 5-minute game with two levels of the n-back tasks. Two Random Forest models trained on thymia data to classify depression and anxiety based on thresholds defined by scores of the Patient Health Questionnaire (8 items) (PHQ-8) ≥10 and those of the Generalized Anxiety Disorder Assessment (7 items) (GAD-7) ≥10, respectively, were subsequently tested on the Laços Saúde patient cohort. Results: The depression classification model exhibited robust performance, achieving an area under the receiver operating characteristic curve (AUC) of 0.78, a specificity of 0.69, and a sensitivity of 0.72. The anxiety classification model showed an initial AUC of 0.63, with a specificity of 0.58 and a sensitivity of 0.64. This performance surpassed a benchmark model using only age and gender, which had AUCs of 0.47 for PHQ-8 and 0.53 for GAD-7. After recomputing the AUC scores on a cross-sectional subset of the data (the first n-back game session), we found AUCs of 0.79 for PHQ-8 and 0.76 for GAD-7. Conclusions: This study successfully demonstrates the preliminary transferability of two AI models trained on a nonverbal working memory task, one for depression and the other for anxiety classification, to a novel sample of older Brazilian adults. Future research could seek to replicate these findings in larger samples and other cultural contexts. 
Trial Registration: ISRCTN Registry ISRCTN90727704; https://www.isrctn.com/ISRCTN90727704 %R 10.2196/55856 %U https://formative.jmir.org/2024/1/e55856 %U https://doi.org/10.2196/55856 %0 Journal Article %@ 2563-3570 %I JMIR Publications %V 5 %N %P e62747 %T Eco-Evolutionary Drivers of Vibrio parahaemolyticus Sequence Type 3 Expansion: Retrospective Machine Learning Approach %A Campbell,Amy Marie %A Hauton,Chris %A van Aerle,Ronny %A Martinez-Urtaza,Jaime %+ Department of Genetics and Microbiology, Autonomous University of Barcelona, Facultat de Biociènces, oficina C3/109, Campus de la UAB, Bellaterra, Barcelona, 08193, Spain, 34 93 581 2729, jaime.martinez.urtaza@uab.cat %K pathogen expansion %K climate change %K machine learning %K ecology %K evolution %K vibrio parahaemolyticus %K sequencing %K sequence type 3 %K VpST3 %K genomics %D 2024 %7 28.11.2024 %9 Original Paper %J JMIR Bioinform Biotech %G English %X Background: Environmentally sensitive pathogens exhibit ecological and evolutionary responses to climate change that result in the emergence and global expansion of well-adapted variants. It is imperative to understand the mechanisms that facilitate pathogen emergence and expansion, as well as the drivers behind the mechanisms, to understand and prepare for future pandemic expansions. Objective: The unique, rapid, global expansion of a clonal complex of Vibrio parahaemolyticus (a marine bacterium causing gastroenteritis infections) named Vibrio parahaemolyticus sequence type 3 (VpST3) provides an opportunity to explore the eco-evolutionary drivers of pathogen expansion. Methods: The global expansion of VpST3 was reconstructed using VpST3 genomes, which were then classified into metrics characterizing the stages of this expansion process, indicative of the stages of emergence and establishment. 
We used machine learning, specifically a random forest classifier, to test a range of ecological and evolutionary drivers for their potential in predicting VpST3 expansion dynamics. Results: We identified a range of evolutionary features, including mutations in the core genome and accessory gene presence, associated with expansion dynamics. A range of random forest classifier approaches were tested to predict expansion classification metrics for each genome. The highest predictive accuracies (ranging from 0.722 to 0.967) were achieved for models using a combined eco-evolutionary approach. While population structure and the difference between introduced and established isolates could be predicted to a high accuracy, our model reported multiple false positives when predicting the success of an introduced isolate, suggesting potential limiting factors not represented in our eco-evolutionary features. Regional models produced for 2 countries reporting the most VpST3 genomes had varying success, reflecting the impacts of class imbalance. Conclusions: These novel insights into evolutionary features and ecological conditions related to the stages of VpST3 expansion showcase the potential of machine learning models using genomic data and will contribute to the future understanding of the eco-evolutionary pathways of climate-sensitive pathogens. 
%M 39607996 %R 10.2196/62747 %U https://bioinform.jmir.org/2024/1/e62747 %U https://doi.org/10.2196/62747 %U http://www.ncbi.nlm.nih.gov/pubmed/39607996 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 8 %N %P e64844 %T Comparative Analysis of Diagnostic Performance: Differential Diagnosis Lists by LLaMA3 Versus LLaMA2 for Case Reports %A Hirosawa,Takanobu %A Harada,Yukinori %A Tokumasu,Kazuki %A Shiraishi,Tatsuya %A Suzuki,Tomoharu %A Shimizu,Taro %+ Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, 880 Kitakobayashi, Mibu-cho, Shimotsuga, 321-0293, Japan, 81 0282861111, hirosawa@dokkyomed.ac.jp %K artificial intelligence %K clinical decision support system %K generative artificial intelligence %K large language models %K natural language processing %K NLP %K AI %K clinical decision making %K decision support %K decision making %K LLM: diagnostic %K case report %K diagnosis %K generative AI %K LLaMA %D 2024 %7 19.11.2024 %9 Original Paper %J JMIR Form Res %G English %X Background: Generative artificial intelligence (AI), particularly in the form of large language models, has rapidly developed. The LLaMA series are popular and recently updated from LLaMA2 to LLaMA3. However, the impacts of the update on diagnostic performance have not been well documented. Objective: We conducted a comparative evaluation of the diagnostic performance in differential diagnosis lists generated by LLaMA3 and LLaMA2 for case reports. Methods: We analyzed case reports published in the American Journal of Case Reports from 2022 to 2023. After excluding nondiagnostic and pediatric cases, we input the remaining cases into LLaMA3 and LLaMA2 using the same prompt and the same adjustable parameters. Diagnostic performance was defined by whether the differential diagnosis lists included the final diagnosis. Multiple physicians independently evaluated whether the final diagnosis was included in the top 10 differentials generated by LLaMA3 and LLaMA2. 
Results: In our comparative evaluation of the diagnostic performance between LLaMA3 and LLaMA2, we analyzed differential diagnosis lists for 392 case reports. The final diagnosis was included in the top 10 differentials generated by LLaMA3 in 79.6% (312/392) of the cases, compared to 49.7% (195/392) for LLaMA2, indicating a statistically significant improvement (P<.001). Additionally, LLaMA3 showed higher performance in including the final diagnosis in the top 5 differentials, observed in 63% (247/392) of cases, compared to LLaMA2’s 38% (149/392, P<.001). Furthermore, the top diagnosis was accurately identified by LLaMA3 in 33.9% (133/392) of cases, significantly higher than the 22.7% (89/392) achieved by LLaMA2 (P<.001). The analysis across various medical specialties revealed variations in diagnostic performance, with LLaMA3 consistently outperforming LLaMA2. Conclusions: The results reveal that the LLaMA3 model significantly outperforms LLaMA2 in diagnostic performance, with a higher percentage of case reports having the final diagnosis listed within the top 10, within the top 5, and as the top diagnosis. Overall diagnostic performance improved almost 1.5 times from LLaMA2 to LLaMA3. These findings support the rapid development and continuous refinement of generative AI systems to enhance diagnostic processes in medicine. However, these findings should be carefully interpreted for clinical application, as generative AI, including the LLaMA series, has not been approved for medical applications such as AI-enhanced diagnostics. 
%M 39561356 %R 10.2196/64844 %U https://formative.jmir.org/2024/1/e64844 %U https://doi.org/10.2196/64844 %U http://www.ncbi.nlm.nih.gov/pubmed/39561356 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e60373 %T Optimizing a Classification Model to Evaluate Individual Susceptibility in Noise-Induced Hearing Loss: Cross-Sectional Study %A Li,Shiyuan %A Yu,Xiao %A Ma,Xinrong %A Wang,Ying %A Guo,Junjie %A Wang,Jiping %A Shen,Wenxin %A Dong,Hongyu %A Salvi,Richard %A Wang,Hui %A Yin,Shankai %K noise-induced hearing loss %K susceptible %K resistance %K machine learning algorithms %K linear regression %K extended high frequencies %K phenotypic characteristics %K genetic heterogeneity %D 2024 %7 14.11.2024 %9 %J JMIR Public Health Surveill %G English %X Background: Noise-induced hearing loss (NIHL), one of the leading causes of hearing loss in young adults, is a major health care problem that has negative social and economic consequences. It is commonly recognized that individual susceptibility largely varies among individuals who are exposed to similar noise. An objective method is, therefore, needed to identify those who are extremely sensitive to noise-exposed jobs to prevent them from developing severe NIHL. Objective: This study aims to determine an optimal model for detecting individuals susceptible or resistant to NIHL and further explore phenotypic traits uniquely associated with their susceptibility profiles. Methods: Cross-sectional data on hearing loss caused by occupational noise were collected from 2015 to 2021 at shipyards in Shanghai, China. Six methods were summarized from the literature review and applied to evaluate their classification performance for susceptibility and resistance of participants to NIHL. A machine learning (ML)–based diagnostic model using frequencies from 0.25 to 12 kHz was developed to determine the most reliable frequencies, considering accuracy and area under the curve. 
An optimal method with the most reliable frequencies was then constructed to detect individuals who were susceptible versus resistant to NIHL. Phenotypic characteristics such as age, exposure time, cumulative noise exposure, and hearing thresholds (HTs) were explored to identify these groups. Results: A total of 6276 participants (median age 41, IQR 33‐47 years; n=5372, 85.6% men) were included in the analysis. The ML-based NIHL diagnostic model with misclassified subjects showed the best performance for identifying workers in the NIHL-susceptible group (NIHL-SG) and NIHL-resistant group (NIHL-RG). The mean HTs at 4 and 12.5 kHz showed the highest predictive value for detecting those in the NIHL-SG and NIHL-RG (accuracy=0.78 and area under the curve=0.81). Individuals in the NIHL-SG selected by the optimized model were younger than those in the NIHL-RG (median 28, IQR 25‐31 years vs median 35, IQR 32‐39 years; P<.001), with a shorter duration of noise exposure (median 5, IQR 2‐8 years vs median 8, IQR 4‐12 years; P<.001) and lower cumulative noise exposure (median 90, IQR 86‐92 dBA-years vs median 92.2, IQR 89.2‐94.7 dBA-years; P<.001) but greater HTs (4 and 12.5 kHz; median 58.8, IQR 53.8‐63.8 dB HL vs median 8.8, IQR 7.5‐11.3 dB HL; P<.001). Conclusions: An ML-based NIHL diagnostic model with misclassified subjects using the mean HTs of 4 and 12.5 kHz was the most reliable method for identifying individuals susceptible or resistant to NIHL. However, further studies are needed to determine the genetic factors that govern NIHL susceptibility. 
Trial Registration: Chinese Clinical Trial Registry ChiCTR-RPC-17012580; https://www.chictr.org.cn/showprojEN.html?proj=21399 %R 10.2196/60373 %U https://publichealth.jmir.org/2024/1/e60373 %U https://doi.org/10.2196/60373 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 13 %N %P e53447 %T Using a Device-Free Wi-Fi Sensing System to Assess Daily Activities and Mobility in Low-Income Older Adults: Protocol for a Feasibility Study %A Chung,Jane %A Pretzer-Aboff,Ingrid %A Parsons,Pamela %A Falls,Katherine %A Bulut,Eyuphan %+ Nell Hodgson Woodruff School of Nursing, Emory University, 1520 Clifton Road NE, Atlanta, GA, 30322, United States, 1 4047277980, jane.chung@emory.edu %K Wi-Fi sensing %K dementia %K mild cognitive impairment %K older adults %K health disparities %K in-home activities %K mobility %K machine learning %D 2024 %7 12.11.2024 %9 Protocol %J JMIR Res Protoc %G English %X Background: Older adults belonging to racial or ethnic minorities with low socioeconomic status are at an elevated risk of developing dementia, but resources for assessing functional decline and detecting cognitive impairment are limited. Cognitive impairment affects the ability to perform daily activities and mobility behaviors. Traditional assessment methods have drawbacks, so smart home technologies (SmHT) have emerged to offer objective, high-frequency, and remote monitoring. However, these technologies usually rely on motion sensors that cannot identify specific activity types. This group often lacks access to these technologies due to limited resources and technology experience. There is a need to develop new sensing technology that is discreet, affordable, and requires minimal user engagement to characterize and quantify various in-home activities. 
Furthermore, it is essential to explore the feasibility of developing machine learning (ML) algorithms for SmHT through collaborations between clinical researchers and engineers and involving minority, low-income older adults for novel sensor development. Objective: This study aims to examine the feasibility of developing a novel channel state information–based device-free, low-cost Wi-Fi sensing system, and associated ML algorithms for localizing and recognizing different patterns of in-home activities and mobility in residents of low-income senior housing with and without mild cognitive impairment. Methods: This feasibility study was conducted in collaboration with a wellness care group, which serves the healthy aging needs of low-income housing residents. Prior to this feasibility study, we conducted a pilot study to collect channel state information data from several activity scenarios (eg, sitting, walking, and preparing meals) using the proposed Wi-Fi sensing system continuously over a week in apartments of low-income housing residents. These activities were videotaped to generate ground truth annotations to test the accuracy of the ML algorithms derived from the proposed system. Using qualitative individual interviews, we explored the acceptability of the Wi-Fi sensing system and implementation barriers in the low-income housing setting. We use the same study protocol for the proposed feasibility study. Results: The Wi-Fi sensing system deployment began in November 2022, with participant recruitment starting in July 2023. Preliminary results will be available in the summer of 2025. Preliminary results are focused on the feasibility of developing ML models for Wi-Fi sensing–based activity and mobility assessment, community-based recruitment and data collection, ground truth, and older adults’ Wi-Fi sensing technology acceptance. 
Conclusions: This feasibility study can make a contribution to SmHT science and ML capabilities for early detection of cognitive decline among socially vulnerable older adults. Currently, sensing devices are not readily available to this population due to cost and information barriers. Our sensing device has the potential to identify individuals at risk for cognitive decline by assessing their level of physical function by tracking their in-home activities and mobility behaviors, at a low cost. International Registered Report Identifier (IRRID): DERR1-10.2196/53447 %M 39531268 %R 10.2196/53447 %U https://www.researchprotocols.org/2024/1/e53447 %U https://doi.org/10.2196/53447 %U http://www.ncbi.nlm.nih.gov/pubmed/39531268 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 26 %N %P e58413 %T Development and Validation of Deep Learning–Based Infectivity Prediction in Pulmonary Tuberculosis Through Chest Radiography: Retrospective Study %A Chung,Wou young %A Yoon,Jinsik %A Yoon,Dukyong %A Kim,Songsoo %A Kim,Yujeong %A Park,Ji Eun %A Kang,Young Ae %+ Department of Internal Medicine, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea, 82 2 2228 1954, mdkang@yuhs.ac %K pulmonary tuberculosis %K chest radiography %K artificial intelligence %K tuberculosis %K TB %K smear %K smear test %K culture test %K diagnosis %K treatment %K deep learning %K CXR %K PTB %K management %K cost effective %K asymptomatic infection %K diagnostic tools %K infectivity %K AI tool %K cohort %D 2024 %7 7.11.2024 %9 Original Paper %J J Med Internet Res %G English %X Background: Pulmonary tuberculosis (PTB) poses a global health challenge owing to the time-intensive nature of traditional diagnostic tests such as smear and culture tests, which can require hours to weeks to yield results. 
Objective: This study aimed to use artificial intelligence (AI)–based chest radiography (CXR) to evaluate the infectivity of patients with PTB more quickly and accurately compared with traditional methods such as smear and culture tests. Methods: We used DenseNet121 and visualization techniques such as gradient-weighted class activation mapping and local interpretable model-agnostic explanations to demonstrate the decision-making process of the model. We analyzed 36,142 CXR images of 4492 patients with PTB obtained from Severance Hospital, focusing specifically on the lung region through segmentation and cropping with TransUNet. We used data from 2004 to 2020 to train the model, data from 2021 for testing, and data from 2022 to 2023 for internal validation. In addition, we used 1978 CXR images of 299 patients with PTB obtained from Yongin Severance Hospital for external validation. Results: In the internal validation, the model achieved an accuracy of 73.27%, an area under the receiver operating characteristic curve of 0.79, and an area under the precision-recall curve of 0.77. In the external validation, it exhibited an accuracy of 70.29%, an area under the receiver operating characteristic curve of 0.77, and an area under the precision-recall curve of 0.8. In addition, gradient-weighted class activation mapping and local interpretable model-agnostic explanations provided insights into the decision-making process of the AI model. Conclusions: This proposed AI tool offers a rapid and accurate alternative for evaluating PTB infectivity through CXR, with significant implications for enhancing screening efficiency by evaluating infectivity before sputum test results in clinical settings, compared with traditional smear and culture tests. 
%M 39509691 %R 10.2196/58413 %U https://www.jmir.org/2024/1/e58413 %U https://doi.org/10.2196/58413 %U http://www.ncbi.nlm.nih.gov/pubmed/39509691 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 12 %N %P e54246 %T A New Natural Language Processing–Inspired Methodology (Detection, Initial Characterization, and Semantic Characterization) to Investigate Temporal Shifts (Drifts) in Health Care Data: Quantitative Study %A Paiva,Bruno %A Gonçalves,Marcos André %A da Rocha,Leonardo Chaves Dutra %A Marcolino,Milena Soriano %A Lana,Fernanda Cristina Barbosa %A Souza-Silva,Maira Viana Rego %A Almeida,Jussara M %A Pereira,Polianna Delfino %A de Andrade,Claudio Moisés Valiense %A Gomes,Angélica Gomides dos Reis %A Ferreira,Maria Angélica Pires %A Bartolazzi,Frederico %A Sacioto,Manuela Furtado %A Boscato,Ana Paula %A Guimarães-Júnior,Milton Henriques %A dos Reis,Priscilla Pereira %A Costa,Felício Roberto %A Jorge,Alzira de Oliveira %A Coelho,Laryssa Reis %A Carneiro,Marcelo %A Sales,Thaís Lorenna Souza %A Araújo,Silvia Ferreira %A Silveira,Daniel Vitório %A Ruschel,Karen Brasil %A Santos,Fernanda Caldeira Veloso %A Cenci,Evelin Paola de Almeida %A Menezes,Luanna Silva Monteiro %A Anschau,Fernando %A Bicalho,Maria Aparecida Camargos %A Manenti,Euler Roberto Fernandes %A Finger,Renan Goulart %A Ponce,Daniela %A de Aguiar,Filipe Carrilho %A Marques,Luiza Margoto %A de Castro,Luís César %A Vietta,Giovanna Grünewald %A Godoy,Mariana Frizzo de %A Vilaça,Mariana do Nascimento %A Morais,Vivian Costa %+ Computer Science Department, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil, Street Daniel de Carvalho, 1846, apto 201, Belo Horizonte, 30431310, Brazil, 55 31999710134, angelfire7@gmail.com %K health care %K machine learning %K data drifts %K temporal drifts %D 2024 %7 28.10.2024 %9 Original Paper %J JMIR Med Inform %G English %X Background: Proper analysis and interpretation of health care data can significantly improve patient outcomes by enhancing services and 
revealing the impacts of new technologies and treatments. Understanding the substantial impact of temporal shifts in these data is crucial. For example, COVID-19 vaccination initially lowered the mean age of at-risk patients and later changed the characteristics of those who died. This highlights the importance of understanding these shifts for assessing factors that affect patient outcomes. Objective: This study aims to propose detection, initial characterization, and semantic characterization (DIS), a new methodology for analyzing changes in health outcomes and variables over time while discovering contextual changes for outcomes in large volumes of data. Methods: The DIS methodology involves 3 steps: detection, initial characterization, and semantic characterization. Detection uses metrics such as Jensen-Shannon divergence to identify significant data drifts. Initial characterization offers a global analysis of changes in data distribution and predictive feature significance over time. Semantic characterization uses natural language processing–inspired techniques to understand the local context of these changes, helping identify factors driving changes in patient outcomes. By integrating the outcomes from these 3 steps, our results can identify specific factors (eg, interventions and modifications in health care practices) that drive changes in patient outcomes. DIS was applied to the Brazilian COVID-19 Registry and the Medical Information Mart for Intensive Care, version IV (MIMIC-IV) data sets. Results: Our approach allowed us to (1) identify drifts effectively, especially using metrics such as the Jensen-Shannon divergence, and (2) uncover reasons for the decline in overall mortality in both the COVID-19 and MIMIC-IV data sets, as well as changes in the cooccurrence between different diseases and this particular outcome. 
Factors such as vaccination during the COVID-19 pandemic and reduced iatrogenic events and cancer-related deaths in MIMIC-IV were highlighted. The methodology also pinpointed shifts in patient demographics and disease patterns, providing insights into the evolving health care landscape during the study period. Conclusions: We developed a novel methodology combining machine learning and natural language processing techniques to detect, characterize, and understand temporal shifts in health care data. This understanding can enhance predictive algorithms, improve patient outcomes, and optimize health care resource allocation, ultimately improving the effectiveness of machine learning predictive algorithms applied to health care data. Our methodology can be applied to a variety of scenarios beyond those discussed in this paper. %M 39467275 %R 10.2196/54246 %U https://medinform.jmir.org/2024/1/e54246 %U https://doi.org/10.2196/54246 %U http://www.ncbi.nlm.nih.gov/pubmed/39467275 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e58358 %T AI Governance: A Challenge for Public Health %A Wagner,Jennifer K %A Doerr,Megan %A Schmit,Cason D %K artificial intelligence %K legislation and jurisprudence %K harm reduction %K social determinants of health %K one health %K AI %K invisible algorithms %K modern life %K public health %K engagement %K AI governance %K traditional regulation %K soft law %D 2024 %7 30.9.2024 %9 %J JMIR Public Health Surveill %G English %X The rapid evolution of artificial intelligence (AI) is structuralizing social, political, and economic determinants of health into the invisible algorithms that shape all facets of modern life. Nevertheless, AI holds immense potential as a public health tool, enabling beneficial objectives such as precision public health and medicine. Developing an AI governance framework that can maximize the benefits and minimize the risks of AI is a significant challenge. 
The benefits of public health engagement in AI governance could be extensive. Here, we describe how several public health concepts can enhance AI governance. Specifically, we explain how (1) harm reduction can provide a framework for navigating the governance debate between traditional regulation and “soft law” approaches; (2) a public health understanding of social determinants of health is crucial to optimally weigh the potential risks and benefits of AI; (3) public health ethics provides a toolset for guiding governance decisions where individual interests intersect with collective interests; and (4) a One Health approach can improve AI governance effectiveness while advancing public health outcomes. Public health theories, perspectives, and innovations could substantially enrich and improve AI governance, creating a more equitable and socially beneficial path for AI development. %R 10.2196/58358 %U https://publichealth.jmir.org/2024/1/e58358 %U https://doi.org/10.2196/58358 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e57437 %T Personality and Health-Related Quality of Life of Older Chinese Adults: Cross-Sectional Study and Moderated Mediation Model Analysis %A Dong,Xing-Xuan %A Huang,Yueqing %A Miao,Yi-Fan %A Hu,Hui-Hui %A Pan,Chen-Wei %A Zhang,Tianyang %A Wu,Yibo %K personality %K health-related quality of life %K older adults %K sleep quality %K quality of life %K old %K older %K Chinese %K China %K mechanisms %K psychology %K behavior %K analysis %K hypothesis %K neuroticism %K mediation analysis %K health care providers %K aging %D 2024 %7 12.9.2024 %9 %J JMIR Public Health Surveill %G English %X Background: Personality has an impact on the health-related quality of life (HRQoL) of older adults. However, the relationship and mechanisms of the 2 variables are controversial, and few studies have been conducted on older adults. 
Objective: The aim of this study was to explore the relationship between personality and HRQoL and the mediating and moderating roles of sleep quality and place of residence in this relationship. Methods: A total of 4123 adults 60 years and older were drawn from the Psychology and Behavior Investigation of Chinese Residents survey. Participants were asked to complete the Big Five Inventory, the brief version of the Pittsburgh Sleep Quality Index, and EQ-5D-5L. A backpropagation neural network was used to explore the order of factors contributing to HRQoL. Path analysis was performed to evaluate the mediation hypothesis. Results: As of August 31, 2022, we had enrolled 4123 adults 60 years and older. Neuroticism and extraversion were strong influencing factors of HRQoL (normalized importance >50%). The results of the mediation analysis suggested that neuroticism and extraversion may diminish and enhance, respectively, HRQoL (index: β=−.262, P<.001; visual analog scale: β=−.193, P<.001) by increasing and decreasing, respectively, brief version of the Pittsburgh Sleep Quality Index scores (neuroticism: β=.17, P<.001; extraversion: β=−.069, P<.001). The multigroup analysis suggested a significant moderating effect of the place of residence (EQ-5D-5L index: P<.001; EQ-5D-5L visual analog scale: P<.001). No significant direct effect was observed between extraversion and EQ-5D-5L index in urban older residents (β=.037, P=.73). Conclusions: This study sheds light on the potential mechanisms of personality and HRQoL among older Chinese adults and can help health care providers and relevant departments take reasonable measures to promote healthy aging. 
%R 10.2196/57437 %U https://publichealth.jmir.org/2024/1/e57437 %U https://doi.org/10.2196/57437 %0 Journal Article %@ 2563-6316 %I JMIR Publications %V 5 %N %P e56993 %T Machine Learning–Based Hyperglycemia Prediction: Enhancing Risk Assessment in a Cohort of Undiagnosed Individuals %A Oyebola,Kolapo %A Ligali,Funmilayo %A Owoloye,Afolabi %A Erinwusi,Blessing %A Alo,Yetunde %A Musa,Adesola Z %A Aina,Oluwagbemiga %A Salako,Babatunde %K hyperglycemia %K diabetes %K machine learning %K hypertension %K random forest %D 2024 %7 11.9.2024 %9 %J JMIRx Med %G English %X Background: Noncommunicable diseases continue to pose a substantial health challenge globally, with hyperglycemia serving as a prominent indicator of diabetes. Objective: This study employed machine learning algorithms to predict hyperglycemia in a cohort of individuals who were asymptomatic and unraveled crucial predictors contributing to early risk identification. Methods: The dataset included an extensive array of clinical and demographic data obtained from 195 adults who were asymptomatic and residing in a suburban community in Nigeria. The study conducted a thorough comparison of multiple machine learning algorithms to ascertain the most effective model for predicting hyperglycemia. Moreover, we explored feature importance to pinpoint correlates of high blood glucose levels within the cohort. Results: Elevated blood pressure and prehypertension were recorded in 8 (4.1%) and 18 (9.2%) of the 195 participants, respectively. A total of 41 (21%) participants presented with hypertension, of whom 34 (83%) were female. However, sex adjustment showed that 34 of 118 (28.8%) female participants and 7 of 77 (9%) male participants had hypertension. Age-based analysis revealed an inverse relationship between normotension and age (r=−0.88; P=.02). Conversely, hypertension increased with age (r=0.53; P=.27), peaking at 50‐59 years. 
Of the 195 participants, isolated systolic hypertension and isolated diastolic hypertension were recorded in 16 (8.2%) and 15 (7.7%) participants, respectively, with female participants recording a higher prevalence of isolated systolic hypertension (11/16, 69%) and male participants reporting a higher prevalence of isolated diastolic hypertension (11/15, 73%). Following class rebalancing, the random forest classifier gave the best performance (accuracy score 0.89; receiver operating characteristic–area under the curve score 0.89; F1-score 0.89) of the 26 model classifiers. The feature selection model identified uric acid and age as important variables associated with hyperglycemia. Conclusions: The random forest classifier identified significant clinical correlates associated with hyperglycemia, offering valuable insights for the early detection of diabetes and informing the design and deployment of therapeutic interventions. However, to achieve a more comprehensive understanding of each feature’s contribution to blood glucose levels, modeling additional relevant clinical features in larger datasets could be beneficial. 
%R 10.2196/56993 %U https://xmed.jmir.org/2024/1/e56993 %U https://doi.org/10.2196/56993 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 26 %N %P e60501 %T Prompt Engineering Paradigms for Medical Applications: Scoping Review %A Zaghir,Jamil %A Naguib,Marco %A Bjelogrlic,Mina %A Névéol,Aurélie %A Tannier,Xavier %A Lovis,Christian %+ Department of Radiology and Medical Informatics, University of Geneva, Chemin des Mines, 9, Geneva, 1202, Switzerland, 41 022 379 08 18, Jamil.Zaghir@unige.ch %K prompt engineering %K prompt design %K prompt learning %K prompt tuning %K large language models %K LLMs %K scoping review %K clinical natural language processing %K natural language processing %K NLP %K medical texts %K medical application %K medical applications %K clinical practice %K privacy %K medicine %K computer science %K medical informatics %D 2024 %7 10.9.2024 %9 Review %J J Med Internet Res %G English %X Background: Prompt engineering, focusing on crafting effective prompts to large language models (LLMs), has garnered attention for its capabilities at harnessing the potential of LLMs. This is even more crucial in the medical domain due to its specialized terminology and language technicity. Clinical natural language processing applications must navigate complex language and ensure privacy compliance. Prompt engineering offers a novel approach by designing tailored prompts to guide models in exploiting clinically relevant information from complex medical texts. Despite its promise, the efficacy of prompt engineering in the medical domain remains to be fully explored. Objective: The aim of the study is to review research efforts and technical approaches in prompt engineering for medical applications as well as provide an overview of opportunities and challenges for clinical practice. Methods: Databases indexing the fields of medicine, computer science, and medical informatics were queried in order to identify relevant published papers. 
Since prompt engineering is an emerging field, preprint databases were also considered. Multiple data were extracted, such as the prompt paradigm, the involved LLMs, the languages of the study, the domain of the topic, the baselines, and several learning, design, and architecture strategies specific to prompt engineering. We include studies that apply prompt engineering–based methods to the medical domain, published between 2022 and 2024, and covering multiple prompt paradigms such as prompt learning (PL), prompt tuning (PT), and prompt design (PD). Results: We included 114 recent prompt engineering studies. Among the 3 prompt paradigms, we have observed that PD is the most prevalent (78 papers). In 12 papers, PD, PL, and PT terms were used interchangeably. While ChatGPT is the most commonly used LLM, we have identified 7 studies using this LLM on a sensitive clinical data set. Chain-of-thought, present in 17 studies, emerges as the most frequent PD technique. While PL and PT papers typically provide a baseline for evaluating prompt-based approaches, 61% (48/78) of the PD studies do not report any nonprompt-related baseline. Finally, we individually examine each of the key prompt engineering–specific information reported across papers and find that many studies neglect to explicitly mention them, posing a challenge for advancing prompt engineering research. Conclusions: In addition to reporting on trends and the scientific landscape of prompt engineering, we provide reporting guidelines for future studies to help advance research in the medical field. We also disclose tables and figures summarizing medical prompt engineering papers available and hope that future contributions will leverage these existing works to better advance the field. 
%M 39255030 %R 10.2196/60501 %U https://www.jmir.org/2024/1/e60501 %U https://doi.org/10.2196/60501 %U http://www.ncbi.nlm.nih.gov/pubmed/39255030 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 26 %N %P e53562 %T Automated Behavioral Coding to Enhance the Effectiveness of Motivational Interviewing in a Chat-Based Suicide Prevention Helpline: Secondary Analysis of a Clinical Trial %A Pellemans,Mathijs %A Salmi,Salim %A Mérelle,Saskia %A Janssen,Wilco %A van der Mei,Rob %+ Department of Mathematics, Vrije Universiteit Amsterdam, De Boelelaan 1111, Amsterdam, 1081 HV, Netherlands, 31 20 5987700, m.j.pellemans@vu.nl %K motivational interviewing %K behavioral coding %K suicide prevention %K artificial intelligence %K effectiveness %K counseling %K support tool %K online help %K mental health %D 2024 %7 1.8.2024 %9 Original Paper %J J Med Internet Res %G English %X Background: With the rise of computer science and artificial intelligence, analyzing large data sets promises enormous potential in gaining insights for developing and improving evidence-based health interventions. One such intervention is the counseling strategy motivational interviewing (MI), which has been found effective in improving a wide range of health-related behaviors. Despite the simplicity of its principles, MI can be a challenging skill to learn and requires expertise to apply effectively. Objective: This study aims to investigate the performance of artificial intelligence models in classifying MI behavior and explore the feasibility of using these models in online helplines for mental health as an automated support tool for counselors in clinical practice. Methods: We used a coded data set of 253 MI counseling chat sessions from the 113 Suicide Prevention helpline. 
With 23,982 messages coded with the MI Sequential Code for Observing Process Exchanges codebook, we trained and evaluated 4 machine learning models and 1 deep learning model to classify client and counselor MI behavior based on language use. Results: The deep learning model BERTje outperformed all machine learning models, accurately predicting counselor behavior (accuracy=0.72, area under the curve [AUC]=0.95, Cohen κ=0.69). It differentiated MI congruent and incongruent counselor behavior (AUC=0.92, κ=0.65) and evocative and nonevocative language (AUC=0.92, κ=0.66). For client behavior, the model achieved an accuracy of 0.70 (AUC=0.89, κ=0.55). The model’s interpretable predictions discerned client change talk and sustain talk, counselor affirmations, and reflection types, facilitating valuable counselor feedback. Conclusions: The results of this study demonstrate that artificial intelligence techniques can accurately classify MI behavior, indicating their potential as a valuable tool for enhancing MI proficiency in online helplines for mental health. Provided that the data set size is sufficiently large with enough training samples for each behavioral code, these methods can be trained and applied to other domains and languages, offering a scalable and cost-effective way to evaluate MI adherence, accelerate behavioral coding, and provide therapists with personalized, quick, and objective feedback. 
%M 39088244 %R 10.2196/53562 %U https://www.jmir.org/2024/1/e53562 %U https://doi.org/10.2196/53562 %U http://www.ncbi.nlm.nih.gov/pubmed/39088244 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 26 %N %P e56930 %T Roles, Users, Benefits, and Limitations of Chatbots in Health Care: Rapid Review %A Laymouna,Moustafa %A Ma,Yuanchao %A Lessard,David %A Schuster,Tibor %A Engler,Kim %A Lebouché,Bertrand %+ Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, D02.4110 – Glen Site, 1001 Decarie Blvd, Montreal, QC, H4A 3J1, Canada, 1 514 843 2090, bertrand.lebouche@mcgill.ca %K chatbot %K conversational agent %K conversational assistant %K user-computer interface %K digital health %K mobile health %K electronic health %K telehealth %K artificial intelligence %K AI %K health information technology %D 2024 %7 23.7.2024 %9 Review %J J Med Internet Res %G English %X Background: Chatbots, or conversational agents, have emerged as significant tools in health care, driven by advancements in artificial intelligence and digital technology. These programs are designed to simulate human conversations, addressing various health care needs. However, no comprehensive synthesis of health care chatbots’ roles, users, benefits, and limitations is available to inform future research and application in the field. Objective: This review aims to describe health care chatbots’ characteristics, focusing on their diverse roles in the health care pathway, user groups, benefits, and limitations. Methods: A rapid review of published literature from 2017 to 2023 was performed with a search strategy developed in collaboration with a health sciences librarian and implemented in the MEDLINE and Embase databases. Primary research studies reporting on chatbot roles or benefits in health care were included. Two reviewers dual-screened the search results. 
Extracted data on chatbot roles, users, benefits, and limitations were subjected to content analysis. Results: The review categorized chatbot roles into 2 themes: delivery of remote health services, including patient support, care management, education, skills building, and health behavior promotion, and provision of administrative assistance to health care providers. User groups spanned across patients with chronic conditions as well as patients with cancer; individuals focused on lifestyle improvements; and various demographic groups such as women, families, and older adults. Professionals and students in health care also emerged as significant users, alongside groups seeking mental health support, behavioral change, and educational enhancement. The benefits of health care chatbots were also classified into 2 themes: improvement of health care quality and efficiency and cost-effectiveness in health care delivery. The identified limitations encompassed ethical challenges, medicolegal and safety concerns, technical difficulties, user experience issues, and societal and economic impacts. Conclusions: Health care chatbots offer a wide spectrum of applications, potentially impacting various aspects of health care. While they are promising tools for improving health care efficiency and quality, their integration into the health care system must be approached with consideration of their limitations to ensure optimal, safe, and equitable use. 
%M 39042446 %R 10.2196/56930 %U https://www.jmir.org/2024/1/e56930 %U https://doi.org/10.2196/56930 %U http://www.ncbi.nlm.nih.gov/pubmed/39042446 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 26 %N %P e58158 %T Evaluating and Enhancing Large Language Models’ Performance in Domain-Specific Medicine: Development and Usability Study With DocOA %A Chen,Xi %A Wang,Li %A You,MingKe %A Liu,WeiZhi %A Fu,Yu %A Xu,Jie %A Zhang,Shaoting %A Chen,Gang %A Li,Kang %A Li,Jian %+ Sports Medicine Center, West China Hospital, Sichuan University, No. 37, Guoxue Alley, Wuhou District, Chengdu, 610041, China, 86 18980601388, lijian_sportsmed@163.com %K large language model %K retrieval-augmented generation %K domain-specific benchmark framework %K osteoarthritis management %D 2024 %7 22.7.2024 %9 Original Paper %J J Med Internet Res %G English %X Background: The efficacy of large language models (LLMs) in domain-specific medicine, particularly for managing complex diseases such as osteoarthritis (OA), remains largely unexplored. Objective: This study focused on evaluating and enhancing the clinical capabilities and explainability of LLMs in specific domains, using OA management as a case study. Methods: A domain-specific benchmark framework was developed to evaluate LLMs across a spectrum from domain-specific knowledge to clinical applications in real-world clinical scenarios. DocOA, a specialized LLM designed for OA management integrating retrieval-augmented generation and instructional prompts, was developed. It can identify the clinical evidence upon which its answers are based through retrieval-augmented generation, thereby demonstrating the explainability of those answers. The study compared the performance of GPT-3.5, GPT-4, and a specialized assistant, DocOA, using objective and human evaluations. 
Results: Results showed that general LLMs such as GPT-3.5 and GPT-4 were less effective in the specialized domain of OA management, particularly in providing personalized treatment recommendations. However, DocOA showed significant improvements. Conclusions: This study introduces a novel benchmark framework that assesses the domain-specific abilities of LLMs in multiple aspects, highlights the limitations of generalized LLMs in clinical contexts, and demonstrates the potential of tailored approaches for developing domain-specific medical LLMs. %M 38833165 %R 10.2196/58158 %U https://www.jmir.org/2024/1/e58158 %U https://doi.org/10.2196/58158 %U http://www.ncbi.nlm.nih.gov/pubmed/38833165 %0 Journal Article %@ 2369-1999 %I JMIR Publications %V 10 %N %P e43070 %T Artificial Intelligence–Based Co-Facilitator (AICF) for Detecting and Monitoring Group Cohesion Outcomes in Web-Based Cancer Support Groups: Single-Arm Trial Study %A Leung,Yvonne W %A Wouterloot,Elise %A Adikari,Achini %A Hong,Jinny %A Asokan,Veenaajaa %A Duan,Lauren %A Lam,Claire %A Kim,Carlina %A Chan,Kai P %A De Silva,Daswin %A Trachtenberg,Lianne %A Rennie,Heather %A Wong,Jiahui %A Esplen,Mary Jane %+ de Souza Institute, University Health Network, de Souza Institute c/o Toronto General Hospital, 200 Elizabeth St RFE 3-440, Toronto, ON, M5G 2C4, Canada, 1 647 299 1360, yw.leung@utoronto.ca %K group cohesion %K LIWC %K online support group %K natural language processing %K NLP %K emotion analysis %K machine learning %K sentiment analysis %K emotion detection %K integrating human knowledge %K emotion lining %K cancer %K oncology %K support group %K artificial intelligence %K AI %K therapy %K online therapist %K emotion %K affect %K speech tagging %K speech tag %K topic modeling %K named entity recognition %K spoken language processing %K focus group %K corpus %K language %K linguistic %D 2024 %7 22.7.2024 %9 Original Paper %J JMIR Cancer %G English %X Background: Commonly offered as supportive care, 
therapist-led online support groups (OSGs) are a cost-effective way to provide support to individuals affected by cancer. One important indicator of a successful OSG session is group cohesion; however, monitoring group cohesion can be challenging due to the lack of nonverbal cues and in-person interactions in text-based OSGs. The Artificial Intelligence–based Co-Facilitator (AICF) was designed to contextually identify therapeutic outcomes from conversations and produce real-time analytics. Objective: The aim of this study was to develop a method to train and evaluate AICF’s capacity to monitor group cohesion. Methods: AICF used a text classification approach to extract the mentions of group cohesion within conversations. A sample of data was annotated by human scorers, which was used as the training data to build the classification model. The annotations were further supported by finding contextually similar group cohesion expressions using word embedding models as well. AICF performance was also compared against the natural language processing software Linguistic Inquiry Word Count (LIWC). Results: AICF was trained on 80,000 messages obtained from Cancer Chat Canada. We tested AICF on 34,048 messages. Human experts scored 6797 (20%) of the messages to evaluate the ability of AICF to classify group cohesion. Results showed that machine learning algorithms combined with human input could detect group cohesion, a clinically meaningful indicator of effective OSGs. After retraining with human input, AICF reached an F1-score of 0.82. AICF performed slightly better at identifying group cohesion compared to LIWC. Conclusions: AICF has the potential to assist therapists by detecting discord in the group amenable to real-time intervention. Overall, AICF presents a unique opportunity to strengthen patient-centered care in web-based settings by attending to individual needs. 
International Registered Report Identifier (IRRID): RR2-10.2196/21453 %M 39037754 %R 10.2196/43070 %U https://cancer.jmir.org/2024/1/e43070 %U https://doi.org/10.2196/43070 %U http://www.ncbi.nlm.nih.gov/pubmed/39037754 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 8 %N %P e51327 %T Public Perceptions and Discussions of the US Food and Drug Administration's JUUL Ban Policy on Twitter: Observational Study %A Liu,Pinxin %A Lou,Xubin %A Xie,Zidian %A Shang,Ce %A Li,Dongmei %+ Department of Clinical and Translational Research, University of Rochester Medical Center, 265 Crittenden Boulevard CU 420708, Rochester, NY, 14642-0708, United States, 1 5852767285, Dongmei_Li@urmc.rochester.edu %K e-cigarettes %K JUUL %K Twitter %K deep learning %K FDA %K Food and Drug Administration %K vape %K vaping %K smoking %K social media %K regulation %D 2024 %7 11.7.2024 %9 Original Paper %J JMIR Form Res %G English %X Background: On June 23, 2022, the US Food and Drug Administration announced a JUUL ban policy, to ban all vaping and electronic cigarette products sold by Juul Labs. Objective: This study aims to understand public perceptions and discussions of this policy using Twitter (subsequently rebranded as X) data. Methods: Using the Twitter streaming application programming interface, 17,007 tweets potentially related to the JUUL ban policy were collected between June 22, 2022, and July 25, 2022. Based on 2600 hand-coded tweets, a deep learning model (RoBERTa) was trained to classify all tweets into propolicy, antipolicy, neutral, and irrelevant categories. A deep learning model (M3 model) was used to estimate basic demographics (such as age and gender) of Twitter users. Furthermore, major topics were identified using latent Dirichlet allocation modeling. A logistic regression model was used to examine the association of different Twitter users with their attitudes toward the policy. 
Results: Among 10,480 tweets related to the JUUL ban policy, there were similar proportions of propolicy and antipolicy tweets (n=2777, 26.5% vs n=2666, 25.44%). Major propolicy topics included “JUUL causes youth addiction,” “market surge of JUUL,” and “health effects of JUUL.” In contrast, major antipolicy topics included “cigarette should be banned instead of JUUL,” “against the irrational policy,” and “emotional catharsis.” Twitter users older than 29 years were more likely to be propolicy (have a positive attitude toward the JUUL ban policy) than those younger than 29 years. Conclusions: Our study showed that public responses to the JUUL ban policy varied with the demographic characteristics of Twitter users. Our findings could provide valuable information to the Food and Drug Administration for future electronic cigarette and other tobacco product regulations. %M 38990633 %R 10.2196/51327 %U https://formative.jmir.org/2024/1/e51327 %U https://doi.org/10.2196/51327 %U http://www.ncbi.nlm.nih.gov/pubmed/38990633 %0 Journal Article %@ 2373-6658 %I JMIR Publications %V 8 %N %P e48378 %T A Random Forest Algorithm for Assessing Risk Factors Associated With Chronic Kidney Disease: Observational Study %A Liu,Pei %A Liu,Yijun %A Liu,Hao %A Xiong,Linping %A Mei,Changlin %A Yuan,Lei %+ Department of Health Management, Second Military Medical University, No.800 Xiangyin Road, Yangpu District, Shanghai, China, Shanghai, 200433, China, 86 15026929271, yuanleigz@163.com %K chronic kidney disease %K random forest model %K risk factors %K assessment %D 2024 %7 3.6.2024 %9 Original Paper %J Asian Pac Isl Nurs J %G English %X Background: The prevalence and mortality rate of chronic kidney disease (CKD) are increasing year by year, and it has become a global public health issue. The economic burden caused by CKD is increasing at a rate of 1% per year. CKD is highly prevalent and its treatment cost is high but unfortunately remains unknown. 
Therefore, early detection and intervention are vital means to mitigate the treatment burden on patients and decrease disease progression. Objective: In this study, we investigated the advantages of using the random forest (RF) algorithm for assessing risk factors associated with CKD. Methods: We included 40,686 people with complete screening records who underwent screening between January 1, 2015, and December 22, 2020, in Jing’an District, Shanghai, China. We grouped the participants into those with and those without CKD by staging based on the glomerular filtration rate staging and grouping based on albuminuria. Using a logistic regression model, we determined the relationship between CKD and risk factors. The RF machine learning algorithm was used to score the predictive variables and rank them based on their importance to construct a prediction model. Results: The logistic regression model revealed that gender, older age, obesity, abnormal index estimated glomerular filtration rate, retirement status, and participation in urban employee medical insurance were significantly associated with the risk of CKD. On RF algorithm–based screening, the top 4 factors influencing CKD were age, albuminuria, working status, and urinary albumin-creatinine ratio. The RF model predicted an area under the receiver operating characteristic curve of 93.15%. Conclusions: Our findings reveal that the RF algorithm has significant predictive value for assessing risk factors associated with CKD and allows the screening of individuals with risk factors. This has crucial implications for early intervention and prevention of CKD. 
%M 38830204 %R 10.2196/48378 %U https://apinj.jmir.org/2024/1/e48378 %U https://doi.org/10.2196/48378 %U http://www.ncbi.nlm.nih.gov/pubmed/38830204 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e52691 %T Exploring the Association Between Structural Racism and Mental Health: Geospatial and Machine Learning Analysis %A Mohebbi,Fahimeh %A Forati,Amir Masoud %A Torres,Lucas %A deRoon-Cassini,Terri A %A Harris,Jennifer %A Tomas,Carissa W %A Mantsch,John R %A Ghose,Rina %+ Department of Pharmacology & Toxicology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI, 53226, United States, 1 4149558861, jomantsch@mcw.edu %K machine learning %K geospatial %K racial disparities %K social determinant of health %K structural racism %K mental health %K health disparities %K deep learning %D 2024 %7 3.5.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Structural racism produces mental health disparities. While studies have examined the impact of individual factors such as poverty and education, the collective contribution of these elements, as manifestations of structural racism, has been less explored. Milwaukee County, Wisconsin, with its racial and socioeconomic diversity, provides a unique context for this multifactorial investigation. Objective: This research aimed to delineate the association between structural racism and mental health disparities in Milwaukee County, using a combination of geospatial and deep learning techniques. We used secondary data sets where all data were aggregated and anonymized before being released by federal agencies. Methods: We compiled 217 georeferenced explanatory variables across domains, initially deliberately excluding race-based factors to focus on nonracial determinants. This approach was designed to reveal the underlying patterns of risk factors contributing to poor mental health, subsequently reintegrating race to assess the effects of racism quantitatively. 
The variable selection combined tree-based methods (random forest) and conventional techniques, supported by variance inflation factor and Pearson correlation analysis for multicollinearity mitigation. The geographically weighted random forest model was used to investigate spatial heterogeneity and dependence. Self-organizing maps, combined with K-means clustering, were used to analyze data from Milwaukee communities, focusing on quantifying the impact of structural racism on the prevalence of poor mental health. Results: While 12 influential factors collectively accounted for 95.11% of the variability in mental health across communities, the top 6 factors (smoking, poverty, insufficient sleep, lack of health insurance, employment, and age) were particularly impactful. Predominantly African American neighborhoods were disproportionately affected: they were 2.23 times more likely to encounter high-risk clusters for poor mental health. Conclusions: The findings demonstrate that structural racism shapes mental health disparities, with Black community members disproportionately impacted. The multifaceted methodological approach underscores the value of integrating geospatial analysis and deep learning to understand complex social determinants of mental health. These insights highlight the need for targeted interventions, addressing both individual and systemic factors to mitigate mental health disparities rooted in structural racism. 
%M 38701436 %R 10.2196/52691 %U https://publichealth.jmir.org/2024/1/e52691 %U https://doi.org/10.2196/52691 %U http://www.ncbi.nlm.nih.gov/pubmed/38701436 %0 Journal Article %@ 1947-2579 %I JMIR Publications %V 16 %N %P e50201 %T Applying Machine Learning Techniques to Implementation Science %A Huguet,Nathalie %A Chen,Jinying %A Parikh,Ravi B %A Marino,Miguel %A Flocke,Susan A %A Likumahuwa-Ackman,Sonja %A Bekelman,Justin %A DeVoe,Jennifer E %+ Department of Family Medicine, Oregon Health & Science University, 3181 SW Sam Jackson Park Road, Portland, OR, 97239, United States, 1 503 494 4404, huguetn@ohsu.edu %K implementation science %K machine learning %K implementation strategies %K techniques %K implementation %K prediction %K adaptation %K acceptance %K challenges %K scientist %D 2024 %7 22.4.2024 %9 Viewpoint %J Online J Public Health Inform %G English %X Machine learning (ML) approaches could expand the usefulness and application of implementation science methods in clinical medicine and public health settings. The aim of this viewpoint is to introduce a roadmap for applying ML techniques to address implementation science questions, such as predicting what will work best, for whom, under what circumstances, and with what predicted level of support, and what and when adaptation or deimplementation are needed. We describe how ML approaches could be used and discuss challenges that implementation scientists and methodologists will need to consider when using ML throughout the stages of implementation. 
%M 38648094 %R 10.2196/50201 %U https://ojphi.jmir.org/2024/1/e50201 %U https://doi.org/10.2196/50201 %U http://www.ncbi.nlm.nih.gov/pubmed/38648094 %0 Journal Article %@ 1947-2579 %I JMIR Publications %V 16 %N %P e50771 %T Machine Learning for Prediction of Tuberculosis Detection: Case Study of Trained African Giant Pouched Rats %A Jonathan,Joan %A Barakabitze,Alcardo Alex %A Fast,Cynthia D %A Cox,Christophe %+ Department of Informatics and Information Technology, Sokoine University of Agriculture, PO Box 3038, Morogoro, United Republic of Tanzania, 255 763 630 054, joanjonathan@sua.ac.tz %K machine learning %K African giant pouched rat %K diagnosis %K tuberculosis %K health care %D 2024 %7 16.4.2024 %9 Original Paper %J Online J Public Health Inform %G English %X Background: Technological advancement has led to the growth and rapid increase of tuberculosis (TB) medical data generated from different health care areas, including diagnosis. Prioritizing better adoption and acceptance of innovative diagnostic technology to reduce the spread of TB significantly benefits developing countries. Trained TB-detection rats are used in Tanzania and Ethiopia for operational research to complement other TB diagnostic tools. This technology has increased new TB case detection owing to its speed, cost-effectiveness, and sensitivity. Objective: During the TB detection process, rats produce vast amounts of data, providing an opportunity to identify interesting patterns that influence TB detection performance. This study aimed to develop models that predict if the rat will hit (indicate the presence of TB within) the sample or not using machine learning (ML) techniques. The goal was to improve the diagnostic accuracy and performance of TB detection involving rats. 
Methods: APOPO (Anti-Persoonsmijnen Ontmijnende Product Ontwikkeling) Center in Morogoro provided data for this study from 2012 to 2019, and 366,441 observations were used to build predictive models using ML techniques, including decision tree, random forest, naïve Bayes, support vector machine, and k-nearest neighbor, by incorporating a variety of variables, such as the diagnostic results from partner health clinics using methods endorsed by the World Health Organization (WHO). Results: The support vector machine technique yielded the highest accuracy of 83.39% for prediction compared to other ML techniques used. Furthermore, this study found that the inclusion of variables related to whether the sample contained TB or not increased the performance accuracy of the predictive model. Conclusions: The inclusion of variables related to the diagnostic results of TB samples may improve the detection performance of the trained rats. The study results may be of importance to TB-detection rat trainers and TB decision-makers as the results may prompt them to take action to maintain the usefulness of the technology and increase the TB detection performance of trained rats. 
%M 38625737 %R 10.2196/50771 %U https://ojphi.jmir.org/2024/1/e50771 %U https://doi.org/10.2196/50771 %U http://www.ncbi.nlm.nih.gov/pubmed/38625737 %0 Journal Article %@ 1947-2579 %I JMIR Publications %V 15 %N %P e52782 %T Machine Learning Model for Predicting Mortality Risk in Patients With Complex Chronic Conditions: Retrospective Analysis %A Hernández Guillamet,Guillem %A Morancho Pallaruelo,Ariadna Ning %A Miró Mezquita,Laura %A Miralles,Ramón %A Mas,Miquel Àngel %A Ulldemolins Papaseit,María José %A Estrada Cuxart,Oriol %A López Seguí,Francesc %+ Chair in ICT and Health, Centre for Health and Social Care Research (CESS), University of Vic - Central University of Catalonia (UVic-UCC), Carrer Miquel Martí i Pol, 1, Vic, 08500, Spain, 1 938863342, francesc.lopez.segui@gmail.com %K machine learning %K mortality prediction %K chronicity %K chromic %K complex %K artificial intelligence %K complexity %K health data %K predict %K prediction %K predictive %K mortality %K death %K classification %K algorithm %K algorithms %K mortality risk %K risk prediction %D 2023 %7 28.12.2023 %9 Original Paper %J Online J Public Health Inform %G English %X Background: The health care system is undergoing a shift toward a more patient-centered approach for individuals with chronic and complex conditions, which presents a series of challenges, such as predicting hospital needs and optimizing resources. At the same time, the exponential increase in health data availability has made it possible to apply advanced statistics and artificial intelligence techniques to develop decision-support systems and improve resource planning, diagnosis, and patient screening. These methods are key to automating the analysis of large volumes of medical data and reducing professional workloads. Objective: This article aims to present a machine learning model and a case study in a cohort of patients with highly complex conditions. 
The objective was to predict mortality within the following 4 years and early mortality over 6 months following diagnosis. The method used easily accessible variables and health care resource utilization information. Methods: A classification algorithm was selected among 6 models implemented and evaluated using a stratified cross-validation strategy with k=10 and a 70/30 train-test split. The evaluation metrics used included accuracy, recall, precision, F1-score, and area under the receiver operating characteristic (AUROC) curve. Results: The model predicted patient death with 87% accuracy, recall of 87%, precision of 82%, F1-score of 84%, and area under the curve (AUC) of 0.88 using the best model, the Extreme Gradient Boosting (XGBoost) classifier. The results were worse when predicting premature deaths (following 6 months) with 83% accuracy (recall=55%, precision=64%, F1-score=57%, and AUC=0.88) using the Gradient Boosting (GRBoost) classifier. Conclusions: This study showcases encouraging outcomes in forecasting mortality among patients with intricate and persistent health conditions. The employed variables are conveniently accessible, and the incorporation of health care resource utilization information of the patient, which has not been employed by current state-of-the-art approaches, displays promising predictive power. The proposed prediction model is designed to efficiently identify cases that need customized care and proactively anticipate the demand for critical resources by health care providers. 
%M 38223690 %R 10.2196/52782 %U https://ojphi.jmir.org/2023/1/e52782 %U https://doi.org/10.2196/52782 %U http://www.ncbi.nlm.nih.gov/pubmed/38223690 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 25 %N %P e52091 %T The Impact of Generative Conversational Artificial Intelligence on the Lesbian, Gay, Bisexual, Transgender, and Queer Community: Scoping Review %A Bragazzi,Nicola Luigi %A Crapanzano,Andrea %A Converti,Manlio %A Zerbetto,Riccardo %A Khamisy-Farah,Rola %+ Laboratory for Industrial and Applied Mathematics, Department of Mathematics and Statistics, York University, 4700 Keele Street, Toronto, ON, M3J 1P3, Canada, 1 416 736 2100, robertobragazzi@gmail.com %K generative conversational artificial intelligence %K chatbot %K lesbian, gay, bisexual, transgender, and queer community %K LGBTQ %K scoping review %K mobile phone %D 2023 %7 6.12.2023 %9 Review %J J Med Internet Res %G English %X Background: Despite recent significant strides toward acceptance, inclusion, and equality, members of the lesbian, gay, bisexual, transgender, and queer (LGBTQ) community still face alarming mental health disparities, being almost 3 times more likely to experience depression, anxiety, and suicidal thoughts than their heterosexual counterparts. These unique psychological challenges are due to discrimination, stigmatization, and identity-related struggles and can potentially benefit from generative conversational artificial intelligence (AI). As the latest advancement in AI, conversational agents and chatbots can imitate human conversation and support mental health, fostering diversity and inclusivity, combating stigma, and countering discrimination. In contrast, if not properly designed, they can perpetuate exclusion and inequities. Objective: This study aims to examine the impact of generative conversational AI on the LGBTQ community. Methods: This study was designed as a scoping review. 
Four electronic scholarly databases (Scopus, Embase, Web of Science, and MEDLINE via PubMed) and gray literature (Google Scholar) were consulted from inception without any language restrictions. Original studies focusing on the LGBTQ community or counselors working with this community exposed to chatbots and AI-enhanced internet-based platforms and exploring the feasibility, acceptance, or effectiveness of AI-enhanced tools were deemed eligible. The findings were reported in accordance with the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews). Results: Seven applications (HIVST-Chatbot, TelePrEP Navigator, Amanda Selfie, Crisis Contact Simulator, REALbot, Tough Talks, and Queer AI) were included and reviewed. The chatbots and internet-based assistants identified served various purposes: (1) to identify LGBTQ individuals at risk of suicide or contracting HIV or other sexually transmitted infections, (2) to provide resources to LGBTQ youth from underserved areas, (3) to facilitate HIV status disclosure to sex partners, and (4) to develop training role-play personas encompassing the diverse experiences and intersecting identities of LGBTQ youth to educate counselors. The use of generative conversational AI for the LGBTQ community is still in its early stages. Initial studies have found that deploying chatbots is feasible and well received, with high ratings for usability and user satisfaction. However, there is room for improvement in terms of the content provided and making conversations more engaging and interactive. Many of these studies used small sample sizes and short-term interventions measuring limited outcomes. Conclusions: Generative conversational AI holds promise, but further development and formal evaluation are needed, including studies with larger samples, longer interventions, and randomized trials to compare different content, delivery methods, and dissemination platforms. 
In addition, a focus on engagement with behavioral objectives is essential to advance this field. The findings have broad practical implications, highlighting that AI’s impact spans various aspects of people’s lives. Assessing AI’s impact on diverse communities and adopting diversity-aware and intersectional approaches can help shape AI’s positive impact on society as a whole. %M 37864350 %R 10.2196/52091 %U https://www.jmir.org/2023/1/e52091 %U https://doi.org/10.2196/52091 %U http://www.ncbi.nlm.nih.gov/pubmed/37864350 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 9 %N %P e46898 %T Application of Machine Learning Prediction of Individual SARS-CoV-2 Vaccination and Infection Status to the French Serosurveillance Survey From March 2020 to 2022: Cross-Sectional Study %A Bougeard,Stéphanie %A Huneau-Salaun,Adeline %A Attia,Mikael %A Richard,Jean-Baptiste %A Demeret,Caroline %A Platon,Johnny %A Allain,Virginie %A Le Vu,Stéphane %A Goyard,Sophie %A Gillon,Véronique %A Bernard-Stoecklin,Sibylle %A Crescenzo-Chaigne,Bernadette %A Jones,Gabrielle %A Rose,Nicolas %A van der Werf,Sylvie %A Lantz,Olivier %A Rose,Thierry %A Noël,Harold %+ Epidemiology, Health and Welfare, Laboratory of Ploufragan-Plouzané-Niort, French Agency for Food, Environmental, Occupational Health & Safety, BP 53 - Technopole Saint Brieuc Armor, Ploufragan, 22440, France, 33 296010150, stephanie.bougeard@anses.fr %K SARS-CoV-2 %K serological surveillance %K infection %K vaccination %K machine learning %K seroprevalence %K blood testing %K immunity %K survey %K vaccine response %K French population %K prediction %D 2023 %7 28.11.2023 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The seroprevalence of SARS-CoV-2 infection in the French population was estimated with a representative, repeated cross-sectional survey based on residual sera from routine blood testing. 
These data contained no information on infection or vaccination status, thus limiting the ability to detail changes observed in the immunity level of the population over time. Objective: Our aim is to predict the infected or vaccinated status of individuals in the French serosurveillance survey based only on the results of serological assays. Reference data on longitudinal serological profiles of seronegative, infected, and vaccinated individuals from another French cohort were used to build the predictive model. Methods: A model of individual vaccination or infection status with respect to SARS-CoV-2 obtained from a machine learning procedure was proposed based on 3 complementary serological assays. This model was applied to the French nationwide serosurveillance survey from March 2020 to March 2022 to estimate the proportions of the population that were negative, infected, vaccinated, or infected and vaccinated. Results: From February 2021 to March 2022, the estimated percentage of infected and unvaccinated individuals in France increased from 7.5% to 16.8%. During this period, the estimated percentage increased from 3.6% to 45.2% for vaccinated and uninfected individuals and from 2.1% to 29.1% for vaccinated and infected individuals. The decrease in the seronegative population can be largely attributed to vaccination. Conclusions: Combining results from the serosurveillance survey with more complete data from another longitudinal cohort completes the information retrieved from serosurveillance while keeping its protocol simple and easy to implement. 
%M 38015594 %R 10.2196/46898 %U https://publichealth.jmir.org/2023/1/e46898 %U https://doi.org/10.2196/46898 %U http://www.ncbi.nlm.nih.gov/pubmed/38015594 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 7 %N %P e47762 %T The Readability and Quality of Web-Based Patient Information on Nasopharyngeal Carcinoma: Quantitative Content Analysis %A Tan,Denise Jia Yun %A Ko,Tsz Ki %A Fan,Ka Siu %+ Department of Surgery, Royal Stoke University Hospital, Newcastle Rd, Stoke on Trent, ST4 6QG, United Kingdom, 44 7378977812, tszkiko95@gmail.com %K nasopharyngeal cancer %K internet information %K readability %K Journal of the American Medical Association %K JAMA %K DISCERN %K artificial intelligence %K AI %D 2023 %7 27.11.2023 %9 Original Paper %J JMIR Form Res %G English %X Background: Nasopharyngeal carcinoma (NPC) is a rare disease that is strongly associated with exposure to the Epstein-Barr virus and is characterized by the formation of malignant cells in nasopharynx tissues. Early diagnosis of NPC is often difficult owing to the location of initial tumor sites and the nonspecificity of initial symptoms, resulting in a higher frequency of advanced-stage diagnoses and a poorer prognosis. Access to high-quality, readable information could improve the early detection of the disease and provide support to patients during disease management. Objective: This study aims to assess the quality and readability of publicly available web-based information in the English language about NPC, using the most popular search engines. Methods: Key terms relevant to NPC were searched across 3 of the most popular internet search engines: Google, Yahoo, and Bing. The top 25 results from each search engine were included in the analysis. Websites that contained text written in languages other than English, required paywall access, targeted medical professionals, or included nontext content were excluded. 
Readability for each website was assessed using the Flesch Reading Ease score and the Flesch-Kincaid grade level. Website quality was assessed using the Journal of the American Medical Association (JAMA) and DISCERN tools as well as the presence of a Health on the Net Foundation seal. Results: Overall, 57 suitable websites were included in this study; 26% (15/57) of the websites were academic. The mean JAMA and DISCERN scores of all websites were 2.80 (IQR 3) and 57.60 (IQR 19), respectively, with a median of 3 (IQR 2-4) and 61 (IQR 49-68), respectively. Health care industry websites (n=3) had the highest mean JAMA score of 4 (SD 0). Academic websites (15/57, 26%) had the highest mean DISCERN score of 77.5. The Health on the Net Foundation seal was present on only 1 website, which also achieved a JAMA score of 3 and a DISCERN score of 50. Significant differences were observed between the JAMA scores of hospital websites and the scores of industry websites (P=.04), news service websites (P<.048), and charity and nongovernmental organization websites (P=.03). Despite being a vital source for patients, general practitioner websites were found to have significantly lower JAMA scores compared with charity websites (P=.05). The overall mean readability scores reflected an average reading age of 14.3 (SD 1.1) years. Conclusions: The results of this study suggest an inconsistent and suboptimal quality of information related to NPC on the internet. On average, websites presented readability challenges, as written information about NPC was above the recommended reading level of sixth grade. As such, web-based information requires improvement in both quality and accessibility, and health care providers should be selective about the information they recommend to patients, ensuring that it is reliable and readable. 
%M 38010802 %R 10.2196/47762 %U https://formative.jmir.org/2023/1/e47762 %U https://doi.org/10.2196/47762 %U http://www.ncbi.nlm.nih.gov/pubmed/38010802 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 25 %N %P e37719 %T Identification of Key Influencers for Secondary Distribution of HIV Self-Testing Kits Among Chinese Men Who Have Sex With Men: Development of an Ensemble Machine Learning Approach %A Jing,Fengshi %A Ye,Yang %A Zhou,Yi %A Ni,Yuxin %A Yan,Xumeng %A Lu,Ying %A Ong,Jason %A Tucker,Joseph D %A Wu,Dan %A Xiong,Yuan %A Xu,Chen %A He,Xi %A Huang,Shanzi %A Li,Xiaofeng %A Jiang,Hongbo %A Wang,Cheng %A Dai,Wencan %A Huang,Liqun %A Mei,Wenhua %A Cheng,Weibin %A Zhang,Qingpeng %A Tang,Weiming %+ Institute for Healthcare Artificial Intelligence Application, Guangdong Second Provincial General Hospital, 466 Xingangzhong Road, Guangzhou, 510317, China, 86 15920567132, weiming_tang@med.unc.edu %K HIV self-testing %K machine learning %K MSM %K men who have sex with men %K secondary distribution %K key influencers identification %D 2023 %7 23.11.2023 %9 Original Paper %J J Med Internet Res %G English %X Background: HIV self-testing (HIVST) has been rapidly scaled up and additional strategies further expand testing uptake. Secondary distribution involves people (defined as “indexes”) applying for multiple kits and subsequently sharing them with people (defined as “alters”) in their social networks. However, identifying key influencers is difficult. Objective: This study aimed to develop an innovative ensemble machine learning approach to identify key influencers among Chinese men who have sex with men (MSM) for secondary distribution of HIVST kits. Methods: We defined three types of key influencers: (1) key distributors who can distribute more kits, (2) key promoters who can contribute to finding first-time testing alters, and (3) key detectors who can help to find positive alters. 
Four machine learning models (logistic regression, support vector machine, decision tree, and random forest) were trained to identify key influencers. An ensemble learning algorithm was adopted to combine these 4 models. For comparison with our machine learning models, self-evaluated leadership scales were used as the human identification approach. Four metrics for performance evaluation, including accuracy, precision, recall, and F1-score, were used to evaluate the machine learning models and the human identification approach. Simulation experiments were carried out to validate our approach. Results: We included 309 indexes (our sample size) who were eligible and applied for multiple test kits; they distributed these kits to 269 alters. We compared the performance of the machine learning classification and ensemble learning models with that of the human identification approach based on leadership self-evaluated scales in terms of the 2 nearest cutoffs. Our approach outperformed human identification (based on the cutoff of the self-reported scales), exceeding by an average accuracy of 11.0%, could distribute 18.2% (95% CI 9.9%-26.5%) more kits, and find 13.6% (95% CI 1.9%-25.3%) more first-time testing alters and 12.0% (95% CI –14.7% to 38.7%) more positive-testing alters. Our approach could also increase the simulated intervention’s efficiency by 17.7% (95% CI –3.5% to 38.8%) compared to that of human identification. Conclusions: We built machine learning models to identify key influencers among Chinese MSM who were more likely to engage in secondary distribution of HIVST kits. 
Trial Registration: Chinese Clinical Trial Registry (ChiCTR) ChiCTR1900025433; https://www.chictr.org.cn/showproj.html?proj=42001 %M 37995110 %R 10.2196/37719 %U https://www.jmir.org/2023/1/e37719 %U https://doi.org/10.2196/37719 %U http://www.ncbi.nlm.nih.gov/pubmed/37995110 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 7 %N %P e44420 %T Patient Journey Toward a Diagnosis of Light Chain Amyloidosis in a National Sample: Cross-Sectional Web-Based Study %A Dou,Xuelin %A Liu,Yang %A Liao,Aijun %A Zhong,Yuping %A Fu,Rong %A Liu,Lihong %A Cui,Canchan %A Wang,Xiaohong %A Lu,Jin %+ Hematology Department, Peking University People's Hospital, 11 Xizhimen South Street, Beijing, 100044, China, 86 13311491805, jin1lu@sina.com.cn %K systemic light chain amyloidosis %K AL amyloidosis %K rare disease %K big data %K network analysis %K machine model %K natural language processing %K web-based %D 2023 %7 2.11.2023 %9 Original Paper %J JMIR Form Res %G English %X Background: Systemic light chain (AL) amyloidosis is a rare and multisystem disease associated with increased morbidity and a poor prognosis. Delayed diagnoses are common due to the heterogeneity of the symptoms. However, real-world insights from Chinese patients with AL amyloidosis have not been investigated. Objective: This study aimed to describe the journey to an AL amyloidosis diagnosis and to build an in-depth understanding of the diagnostic process from the perspective of both clinicians and patients to obtain a correct and timely diagnosis. Methods: Publicly available disease-related content from social media platforms between January 2008 and April 2021 was searched. After performing data collection steps with a machine model, a series of disease-related posts were extracted. Natural language processing was used to identify the relevance of variables, followed by further manual evaluation and analysis. 
Results: A total of 2204 valid posts related to AL amyloidosis were included in this study, of which 1968 were posted on haodf.com. Of these posts, 1284 were posted by men (median age 57, IQR 46-67 years); 1459 posts mentioned renal-related symptoms, followed by heart (n=833), liver (n=491), and stomach (n=368) symptoms. Furthermore, 1502 posts mentioned symptoms related to 2 or more organs. Symptoms for AL amyloidosis most frequently mentioned by suspected patients were nonspecific weakness (n=252), edema (n=196), hypertrophy (n=168), and swelling (n=140). Multiple physician visits were common, and nephrologists (n=265) and hematologists (n=214) were the most frequently visited specialists by suspected patients for initial consultation. Additionally, interhospital referrals were also commonly seen, centralizing in tertiary hospitals. Conclusions: Chinese patients with AL amyloidosis experienced referrals during their journey toward accurate diagnosis. Increasing awareness of the disease and early referral to a specialized center with expertise may reduce delayed diagnosis and improve patient management. 
%M 37917132 %R 10.2196/44420 %U https://formative.jmir.org/2023/1/e44420 %U https://doi.org/10.2196/44420 %U http://www.ncbi.nlm.nih.gov/pubmed/37917132 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 25 %N %P e45085 %T Influenza Epidemic Trend Surveillance and Prediction Based on Search Engine Data: Deep Learning Model Study %A Yang,Liuyang %A Zhang,Ting %A Han,Xuan %A Yang,Jiao %A Sun,Yanxia %A Ma,Libing %A Chen,Jialong %A Li,Yanming %A Lai,Shengjie %A Li,Wei %A Feng,Luzhao %A Yang,Weizhong %+ School of Population Medicine and Public Health, Chinese Academy of Medical Sciences & Peking Union Medical College, 9 Dong Dan San Tiao, Dongcheng District, Beijing, 100730, China, 86 010 65120552, yangweizhong@cams.cn %K early warning %K epidemic intelligence %K infectious disease %K influenza-like illness %K surveillance %D 2023 %7 17.10.2023 %9 Original Paper %J J Med Internet Res %G English %X Background: Influenza outbreaks pose a significant threat to global public health. Traditional surveillance systems and simple algorithms often struggle to predict influenza outbreaks in an accurate and timely manner. Big data and modern technology have offered new modalities for disease surveillance and prediction. Influenza-like illness can serve as a valuable surveillance tool for emerging respiratory infectious diseases like influenza and COVID-19, especially when reported case data may not fully reflect the actual epidemic curve. Objective: This study aimed to develop a predictive model for influenza outbreaks by combining Baidu search query data with traditional virological surveillance data. The goal was to improve early detection and preparedness for influenza outbreaks in both northern and southern China, providing evidence for supplementing modern intelligence epidemic surveillance methods. 
Methods: We collected virological data from the National Influenza Surveillance Network and Baidu search query data from January 2011 to July 2018, totaling 3,691,865 and 1,563,361 respective samples. Relevant search terms related to influenza were identified and analyzed for their correlation with influenza-positive rates using Pearson correlation analysis. A distributed lag nonlinear model was used to assess the lag correlation of the search terms with influenza activity. Subsequently, a predictive model based on the gated recurrent unit and multiple attention mechanisms was developed to forecast the influenza-positive trend. Results: This study revealed a high correlation between specific Baidu search terms and influenza-positive rates in both northern and southern China, except for 1 term. The search terms were categorized into 4 groups: essential facts on influenza, influenza symptoms, influenza treatment and medicine, and influenza prevention, all of which showed correlation with the influenza-positive rate. The influenza prevention and influenza symptom groups had a lag correlation of 1.4-3.2 and 5.0-8.0 days, respectively. The Baidu search terms could help predict the influenza-positive rate 14-22 days in advance in southern China but interfered with influenza surveillance in northern China. Conclusions: Complementing traditional disease surveillance systems with information from web-based data sources can aid in detecting warning signs of influenza outbreaks earlier. However, supplementation of modern surveillance with search engine information should be approached cautiously. This approach provides valuable insights for digital epidemiology and has the potential for broader application in respiratory infectious disease surveillance. Further research should explore the optimization and customization of search terms for different regions and languages to improve the accuracy of influenza prediction models. 
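The Pearson correlation step described in the Methods can be illustrated with a minimal sketch. It uses a plain lagged Pearson correlation, which is a simpler stand-in and not the distributed lag nonlinear model the authors used. The function names (`pearson_r`, `lagged_correlations`) and the toy series are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def lagged_correlations(search_volume, positive_rate, max_lag):
    """Correlate search_volume[t] with positive_rate[t + lag] for each lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = search_volume[: len(search_volume) - lag]
        y = positive_rate[lag:]
        result[lag] = pearson_r(x, y)
    return result


# Toy data (hypothetical): the positive rate follows search volume by 2 steps.
search_volume = [1, 3, 2, 5, 4, 6, 8, 7, 9, 10]
positive_rate = [0, 0, 1, 3, 2, 5, 4, 6, 8, 7]

correlations = lagged_correlations(search_volume, positive_rate, max_lag=3)
best_lag = max(correlations, key=correlations.get)
```

In this toy series the lag with the highest correlation is the one built into the data. Real surveillance data would also need seasonality and autocorrelation handled before a lag is interpreted.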
%M 37847532 %R 10.2196/45085 %U https://www.jmir.org/2023/1/e45085 %U https://doi.org/10.2196/45085 %U http://www.ncbi.nlm.nih.gov/pubmed/37847532 %0 Journal Article %@ 2368-7959 %I JMIR Publications %V 10 %N %P e49359 %T Identifying Rare Circumstances Preceding Female Firearm Suicides: Validating A Large Language Model Approach %A Zhou,Weipeng %A Prater,Laura C %A Goldstein,Evan V %A Mooney,Stephen J %+ Department of Epidemiology, School of Public Health, University of Washington, Hans Rosling Center for Population Health, 3980 15th Ave NE, Seattle, WA, 98195, United States, 1 206 685 1643, sjm2186@uw.edu %K female firearm suicide %K large language model %K document classification %K suicide prevention %K suicide %K firearm suicide %K machine learning %K mental health for women %K violent death %K mental health %K language models %K women %K female %K depression %K suicidal %D 2023 %7 17.10.2023 %9 Short Paper %J JMIR Ment Health %G English %X Background: Firearm suicide has been more prevalent among males, but age-adjusted female firearm suicide rates increased by 20% from 2010 to 2020, outpacing the rate increase among males by about 8 percentage points, and female firearm suicide may have different contributing circumstances. In the United States, the National Violent Death Reporting System (NVDRS) is a comprehensive source of data on violent deaths and includes unstructured incident narrative reports from coroners or medical examiners and law enforcement. Conventional natural language processing approaches have been used to identify common circumstances preceding female firearm suicide deaths but failed to identify rarer circumstances due to insufficient training data. Objective: This study aimed to leverage a large language model approach to identify infrequent circumstances preceding female firearm suicide in the unstructured coroners or medical examiners and law enforcement narrative reports available in the NVDRS. 
Methods: We used the narrative reports of 1462 female firearm suicide decedents in the NVDRS from 2014 to 2018. The reports were written in English. We coded 9 infrequent circumstances preceding female firearm suicides. We experimented with predicting those circumstances by leveraging a large language model approach in a yes/no question-answer format. We measured the prediction accuracy with F1-score (ranging from 0 to 1). F1-score is the harmonic mean of precision (positive predictive value) and recall (true positive rate or sensitivity). Results: Our large language model outperformed a conventional support vector machine–supervised machine learning approach by a wide margin. Compared to the support vector machine model, which had F1-scores less than 0.2 for most infrequent circumstances, our large language model approach achieved an F1-score of over 0.6 for 4 circumstances and 0.8 for 2 circumstances. Conclusions: The use of a large language model approach shows promise. Researchers interested in using natural language processing to identify infrequent circumstances in narrative report data may benefit from large language models. 
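The F1-score definition given in the Methods (harmonic mean of precision and recall) can be sketched for a single yes/no circumstance as follows. This is an illustrative sketch only. The function name `precision_recall_f1` and the toy labels are hypothetical and are not taken from the study.

```python
def precision_recall_f1(true_labels, predicted_labels):
    """Return (precision, recall, F1) for binary labels, where True means 'yes'."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)        # true positives
    fp = sum(1 for t, p in pairs if not t and p)    # false positives
    fn = sum(1 for t, p in pairs if t and not p)    # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example (hypothetical): 3 of 8 narratives truly mention the circumstance.
true_labels = [True, True, True, False, False, False, False, False]
predicted_labels = [True, True, False, True, False, False, False, False]

precision, recall, f1 = precision_recall_f1(true_labels, predicted_labels)
```

F1 ignores true negatives, which is why it is more informative than accuracy for rare circumstances: a model that always answers "no" would score high accuracy but an F1 of 0.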
%M 37847549 %R 10.2196/49359 %U https://mental.jmir.org/2023/1/e49359 %U https://doi.org/10.2196/49359 %U http://www.ncbi.nlm.nih.gov/pubmed/37847549 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 25 %N %P e42758 %T Conversational AI and Vaccine Communication: Systematic Review of the Evidence %A Passanante,Aly %A Pertwee,Ed %A Lin,Leesa %A Lee,Kristi Yoonsup %A Wu,Joseph T %A Larson,Heidi J %+ Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, Keppel Street, London, WC1E 7HT, United Kingdom, 44 2076368636, aly.passanante@lshtm.ac.uk %K chatbots %K artificial intelligence %K conversational AI %K vaccine communication %K vaccine hesitancy %K conversational agent %K COVID-19 %K vaccine information %K health information %D 2023 %7 3.10.2023 %9 Review %J J Med Internet Res %G English %X Background: Since the mid-2010s, use of conversational artificial intelligence (AI; chatbots) in health care has expanded significantly, especially in the context of increased burdens on health systems and restrictions on in-person consultations with health care providers during the COVID-19 pandemic. One emerging use for conversational AI is to capture evolving questions and communicate information about vaccines and vaccination. Objective: The objective of this systematic review was to examine documented uses and evidence on the effectiveness of conversational AI for vaccine communication. Methods: This systematic review was conducted following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. PubMed, Web of Science, PsycINFO, MEDLINE, Scopus, CINAHL Complete, Cochrane Library, Embase, Epistemonikos, Global Health, Global Index Medicus, Academic Search Complete, and the University of London library database were searched for papers on the use of conversational AI for vaccine communication. 
The inclusion criteria were studies that included (1) documented instances of conversational AI being used for the purpose of vaccine communication and (2) evaluation data on the impact and effectiveness of the intervention. Results: After duplicates were removed, the review identified 496 unique records, which were then screened by title and abstract, of which 38 were identified for full-text review. Seven fit the inclusion criteria and were assessed and summarized in the findings of this review. Overall, vaccine chatbots deployed to date have been relatively simple in their design and have mainly been used to provide factual information to users in response to their questions about vaccines. Additionally, chatbots have been used for vaccination scheduling, appointment reminders, debunking misinformation, and, in some cases, for vaccine counseling and persuasion. Available evidence suggests that chatbots can have a positive effect on vaccine attitudes; however, studies were typically exploratory in nature, and some lacked a control group or had very small sample sizes. Conclusions: The review found evidence of potential benefits from conversational AI for vaccine communication. Factors that may contribute to the effectiveness of vaccine chatbots include their ability to provide credible and personalized information in real time, the familiarity and accessibility of the chatbot platform, and the extent to which interactions with the chatbot feel “natural” to users. However, evaluations have focused on the short-term, direct effects of chatbots on their users. The potential longer-term and societal impacts of conversational AI have yet to be analyzed. In addition, existing studies do not adequately address how ethics apply in the field of conversational AI around vaccines. In a context where further digitalization of vaccine communication can be anticipated, additional high-quality research will be required across all these areas. 
%M 37788057 %R 10.2196/42758 %U https://www.jmir.org/2023/1/e42758 %U https://doi.org/10.2196/42758 %U http://www.ncbi.nlm.nih.gov/pubmed/37788057 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 7 %N %P e49898 %T Parkinson Disease Recognition Using a Gamified Website: Machine Learning Development and Usability Study %A Parab,Shubham %A Boster,Jerry %A Washington,Peter %+ Department of Information & Computer Sciences, University of Hawaii at Manoa, 2500 Campus Rd, Honolulu, HI, 96822, United States, 1 1 512 680 0926, pyw@hawaii.edu %K Parkinson disease %K digital health %K machine learning %K remote screening %K accessible screening %D 2023 %7 29.9.2023 %9 Original Paper %J JMIR Form Res %G English %X Background: Parkinson disease (PD) affects millions globally, causing motor function impairments. Early detection is vital, and diverse data sources aid diagnosis. We focus on lower arm movements during keyboard and trackpad or touchscreen interactions, which serve as reliable indicators of PD. Previous works explore keyboard tapping and unstructured device monitoring; we attempt to further these works with structured tests taking into account 2D hand movement in addition to finger tapping. Our feasibility study uses keystroke and mouse movement data from a remotely conducted, structured, web-based test combined with self-reported PD status to create a predictive model for detecting the presence of PD. Objective: Analysis of finger tapping speed and accuracy through keyboard input and analysis of 2D hand movement through mouse input allowed differentiation between participants with and without PD. This comparative analysis enables us to establish clear distinctions between the two groups and explore the feasibility of using motor behavior to predict the presence of the disease. Methods: Participants were recruited via email by the Hawaii Parkinson Association (HPA) and directed to a web application for the tests. 
The 2023 HPA symposium was also used as a forum to recruit participants and spread information about our study. The application recorded participant demographics, including age, gender, and race, as well as PD status. We conducted a series of tests to assess finger tapping, using on-screen prompts to request key presses of constant and random keys. Response times, accuracy, and unintended movements resulting in accidental presses were recorded. Participants performed a hand movement test consisting of tracing straight and curved on-screen ribbons using a trackpad or mouse, allowing us to evaluate stability and precision of 2D hand movement. From this tracing, the test collected and stored insights concerning lower arm motor movement. Results: Our formative study included 31 participants, 18 without PD and 13 with PD, and analyzed their lower limb movement data collected from keyboards and computer mice. From the data set, we extracted 28 features and evaluated their significances using an extra tree classifier predictor. A random forest model was trained using the 6 most important features identified by the predictor. These selected features provided insights into precision and movement speed derived from keyboard tapping and mouse tracing tests. This final model achieved an average F1-score of 0.7311 (SD 0.1663) and an average accuracy of 0.7429 (SD 0.1400) over 20 runs for predicting the presence of PD. Conclusions: This preliminary feasibility study suggests the possibility of using technology-based limb movement data to predict the presence of PD, demonstrating the practicality of implementing this approach in a cost-effective and accessible manner. In addition, this study demonstrates that structured mouse movement tests can be used in combination with finger tapping to detect PD. 
%M 37773607 %R 10.2196/49898 %U https://formative.jmir.org/2023/1/e49898 %U https://doi.org/10.2196/49898 %U http://www.ncbi.nlm.nih.gov/pubmed/37773607 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 25 %N %P e45019 %T Hot Topic Recognition of Health Rumors Based on Anti-Rumor Articles on the WeChat Official Account Platform: Topic Modeling %A Li,Ziyu %A Wu,Xiaoqian %A Xu,Lin %A Liu,Ming %A Huang,Cheng %+ Chongqing Medical University, College of Medical Informatics, No.1 Medical College Road, Yuzhong District, Chongqing, 400016, China, 86 023 6848 0060, huangcheng@cqmu.edu.cn %K topic model %K health rumors %K social media %K WeChat official account %K content analysis %K public health %K machine learning %K Twitter %K social network %K misinformation %K users %K public health %K disease %K diet %D 2023 %7 21.9.2023 %9 Original Paper %J J Med Internet Res %G English %X Background: Social networks have become one of the main channels for obtaining health information. However, they have also become a source of health-related misinformation, which seriously threatens the public’s physical and mental health. Governance of health-related misinformation can be implemented through topic identification of rumors on social networks. However, little attention has been paid to studying the types and routes of dissemination of health rumors on the internet, especially rumors regarding health-related information in Chinese social media. Objective: This study aims to explore the types of health-related misinformation favored by WeChat public platform users and their prevalence trends and to analyze the modeling results of the text by using the Latent Dirichlet Allocation model. Methods: We used a web crawler tool to capture health rumor–dispelling articles on WeChat rumor-dispelling public accounts. We collected information from health-debunking articles posted between January 1, 2016, and August 31, 2022. 
Following word segmentation of the collected text, a document topic generation model called Latent Dirichlet Allocation was used to identify and generalize the most common topics. The proportion distribution of the themes was calculated, and the negative impact of various health rumors in different periods was analyzed. Additionally, the prevalence of health rumors was analyzed by the number of health rumors generated at each time point. Results: We collected 9366 rumor-refuting articles from January 1, 2016, to August 31, 2022, from WeChat official accounts. Through topic modeling, we divided the health rumors into 8 topics, that is, rumors on prevention and treatment of infectious diseases (1284/9366, 13.71%), disease therapy and its effects (1037/9366, 11.07%), food safety (1243/9366, 13.27%), cancer and its causes (946/9366, 10.10%), regimen and disease (1540/9366, 16.44%), transmission (914/9366, 9.76%), healthy diet (1068/9366, 11.40%), and nutrition and health (1334/9366, 14.24%). Furthermore, we summarized the 8 topics under 4 themes, that is, public health, disease, diet and health, and spread of rumors. Conclusions: Our study shows that topic modeling can provide analysis and insights into health rumor governance. The rumor development trends showed that most rumors were on public health, disease, and diet and health problems. Governments still need to implement relevant and comprehensive rumor management strategies based on the rumors prevalent in their countries and formulate appropriate policies. Apart from regulating the content disseminated on social media platforms, the national quality of health education should also be improved. Governance of social networks should be clearly implemented, as these rapidly developed platforms come with privacy issues. Both disseminators and receivers of information should ensure a realistic attitude and disseminate health information correctly. 
In addition, we recommend that sentiment analysis–related studies be conducted to verify the impact of health rumor–related topics. %M 37733396 %R 10.2196/45019 %U https://www.jmir.org/2023/1/e45019 %U https://doi.org/10.2196/45019 %U http://www.ncbi.nlm.nih.gov/pubmed/37733396 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 25 %N %P e46523 %T Bot or Not? Detecting and Managing Participant Deception When Conducting Digital Research Remotely: Case Study of a Randomized Controlled Trial %A Loebenberg,Gemma %A Oldham,Melissa %A Brown,Jamie %A Dinu,Larisa %A Michie,Susan %A Field,Matt %A Greaves,Felix %A Garnett,Claire %+ UCL Tobacco and Alcohol Research Group, University College London, 1-19 Torrington Place, London, WC1E 7HB, United Kingdom, 44 20 7679 8781, gemma.loebenberg@ucl.ac.uk %K artificial intelligence %K false information %K mHealth applications %K participant deception %K participant %K recruit %K research subject %K web-based studies %D 2023 %7 14.9.2023 %9 Original Paper %J J Med Internet Res %G English %X Background: Evaluating digital interventions using remote methods enables the recruitment of large numbers of participants relatively conveniently and cheaply compared with in-person methods. However, conducting research remotely based on participant self-report with little verification is open to automated “bots” and participant deception. Objective: This paper uses a case study of a remotely conducted trial of an alcohol reduction app to highlight and discuss (1) the issues with participant deception affecting remote research trials with financial compensation; and (2) the importance of rigorous data management to detect and address these issues. Methods: We recruited participants on the internet from July 2020 to March 2022 for a randomized controlled trial (n=5602) evaluating the effectiveness of an alcohol reduction app, Drink Less. Follow-up occurred at 3 time points, with financial compensation offered (up to £36 [US $39.23]). 
Address authentication and telephone verification were used to detect 2 kinds of deception: “bots,” that is, automated responses generated in clusters; and manual participant deception, that is, participants providing false information. Results: Of the 1142 participants who enrolled in the first 2 months of recruitment, 75.6% (n=863) of them were identified as bots during data screening. As a result, a CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) was added, and after this, no more bots were identified. Manual participant deception occurred throughout the study. Of the 5956 participants (excluding bots) who enrolled in the study, 298 (5%) were identified as false participants. The extent of this decreased from 110 in November 2020, to a negligible level by February 2022 including a number of months with 0. The decline occurred after we added further screening questions such as attention checks, removed the prominence of financial compensation from social media advertising, and added an additional requirement to provide a mobile phone number for identity verification. Conclusions: Data management protocols are necessary to detect automated bots and manual participant deception in remotely conducted trials. Bots and manual deception can be minimized by adding a CAPTCHA, attention checks, a requirement to provide a phone number for identity verification, and not prominently advertising financial compensation on social media. 
Trial Registration: ISRCTN Number ISRCTN64052601; https://doi.org/10.1186/ISRCTN64052601 %M 37707943 %R 10.2196/46523 %U https://www.jmir.org/2023/1/e46523 %U https://doi.org/10.2196/46523 %U http://www.ncbi.nlm.nih.gov/pubmed/37707943 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 7 %N %P e42756 %T Identification of Risk Groups for and Factors Affecting Metabolic Syndrome in South Korean Single-Person Households Using Latent Class Analysis and Machine Learning Techniques: Secondary Analysis Study %A Lee,Ji-Soo %A Lee,Soo-Kyoung %+ Big Data Convergence and Open Sharing System, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 08826, Republic of Korea, 82 2 889 5710, soo1005s@gmail.com %K latent class analysis %K machine learning %K metabolic syndrome %K risk factor %K single-person households %D 2023 %7 12.9.2023 %9 Original Paper %J JMIR Form Res %G English %X Background: The rapid increase of single-person households in South Korea is leading to an increase in the incidence of metabolic syndrome, which causes cardiovascular and cerebrovascular diseases, due to lifestyle changes. It is necessary to analyze the complex effects of metabolic syndrome risk factors in South Korean single-person households, which differ from one household to another, considering the diversity of single-person households. Objective: This study aimed to identify the factors affecting metabolic syndrome in single-person households using machine learning techniques and categorically characterize the risk factors through latent class analysis (LCA). Methods: This cross-sectional study included 10-year secondary data obtained from the National Health and Nutrition Examination Survey (2009-2018). We selected 1371 participants belonging to single-person households. Data were analyzed using SPSS (version 25.0; IBM Corp), Mplus (version 8.0; Muthen & Muthen), and Python (version 3.0; Plone & Python). 
We applied 4 machine learning algorithms (logistic regression, decision tree, random forest, and extreme gradient boost) to identify important factors and then applied LCA to categorize the risk groups of metabolic syndromes in single-person households. Results: Through LCA, participants were classified into 4 groups (group 1: intense physical activity in early adulthood, group 2: hypertension among middle-aged female respondents, group 3: smoking and drinking among middle-aged male respondents, and group 4: obesity and abdominal obesity among middle-aged respondents). In addition, age, BMI, obesity, subjective body shape recognition, alcohol consumption, smoking, binge drinking frequency, and job type were investigated as common factors that affect metabolic syndrome in single-person households through machine learning techniques. Group 4 was the most susceptible and at-risk group for metabolic syndrome (odds ratio 17.67, 95% CI 14.5-25.3; P<.001), and obesity and abdominal obesity were the most influential risk factors for metabolic syndrome. Conclusions: This study identified risk groups and factors affecting metabolic syndrome in single-person households through machine learning techniques and LCA. Through these findings, customized interventions for each generational risk factor for metabolic syndrome can be implemented, leading to the prevention of metabolic syndrome, which causes cardiovascular and cerebrovascular diseases. In conclusion, this study contributes to the prevention of metabolic syndrome in single-person households by providing new insights and priority groups for the development of customized interventions using classification. 
%M 37698907 %R 10.2196/42756 %U https://formative.jmir.org/2023/1/e42756 %U https://doi.org/10.2196/42756 %U http://www.ncbi.nlm.nih.gov/pubmed/37698907 %0 Journal Article %@ 2369-3762 %I JMIR Publications %V 9 %N %P e48254 %T Assessing Health Students' Attitudes and Usage of ChatGPT in Jordan: Validation Study %A Sallam,Malik %A Salim,Nesreen A %A Barakat,Muna %A Al-Mahzoum,Kholoud %A Al-Tammemi,Ala'a B %A Malaeb,Diana %A Hallit,Rabih %A Hallit,Souheil %+ Department of Pathology, Microbiology and Forensic Medicine, School of Medicine, The University of Jordan, Queen Rania Al-Abdullah Street-Aljubeiha, Amman, 11942, Jordan, 962 0791845186, malik.sallam@ju.edu.jo %K artificial intelligence %K machine learning %K education %K technology %K healthcare %K survey %K opinion %K knowledge %K practices %K KAP %D 2023 %7 5.9.2023 %9 Original Paper %J JMIR Med Educ %G English %X Background: ChatGPT is a conversational large language model that has the potential to revolutionize knowledge acquisition. However, the impact of this technology on the quality of education is still unknown considering the risks and concerns surrounding ChatGPT use. Therefore, it is necessary to assess the usability and acceptability of this promising tool. As an innovative technology, the intention to use ChatGPT can be studied in the context of the technology acceptance model (TAM). Objective: This study aimed to develop and validate a TAM-based survey instrument called TAME-ChatGPT (Technology Acceptance Model Edited to Assess ChatGPT Adoption) that could be employed to examine the successful integration and use of ChatGPT in health care education. Methods: The survey tool was created based on the TAM framework. It comprised 13 items for participants who heard of ChatGPT but did not use it and 23 items for participants who used ChatGPT. Using a convenient sampling approach, the survey link was circulated electronically among university students between February and March 2023. 
Exploratory factor analysis (EFA) was used to assess the construct validity of the survey instrument. Results: The final sample comprised 458 respondents, the majority among them undergraduate students (n=442, 96.5%). Only 109 (23.8%) respondents had heard of ChatGPT prior to participation and only 55 (11.3%) self-reported ChatGPT use before the study. EFA analysis on the attitude and usage scales showed significant Bartlett tests of sphericity scores (P<.001) and adequate Kaiser-Meyer-Olkin measures (0.823 for the attitude scale and 0.702 for the usage scale), confirming the factorability of the correlation matrices. The EFA showed that 3 constructs explained a cumulative total of 69.3% variance in the attitude scale, and these subscales represented perceived risks, attitude to technology/social influence, and anxiety. For the ChatGPT usage scale, EFA showed that 4 constructs explained a cumulative total of 72% variance in the data and comprised the perceived usefulness, perceived risks, perceived ease of use, and behavior/cognitive factors. All the ChatGPT attitude and usage subscales showed good reliability with Cronbach α values >.78 for all the deduced subscales. Conclusions: The TAME-ChatGPT demonstrated good reliability, validity, and usefulness in assessing health care students’ attitudes toward ChatGPT. The findings highlighted the importance of considering risk perceptions, usefulness, ease of use, attitudes toward technology, and behavioral factors when adopting ChatGPT as a tool in health care education. This information can aid the stakeholders in creating strategies to support the optimal and ethical use of ChatGPT and to identify the potential challenges hindering its successful implementation. Future research is recommended to guide the effective adoption of ChatGPT in health care education. 
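The reliability figures reported above are Cronbach α values. As a minimal sketch of the standard formula, α = k/(k-1) × (1 - Σ item variances / variance of total scores), the snippet below computes α for one subscale. The function name `cronbach_alpha` and the toy Likert responses are hypothetical and do not come from the TAME-ChatGPT data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    items = list(zip(*responses))                      # one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy 3-item subscale answered by 5 respondents (hypothetical values).
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]

alpha = cronbach_alpha(responses)
```

Sample variances (n-1 denominator) are used for both the item and total-score variances. The ratio is unchanged if population variances are used consistently instead.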
%M 37578934 %R 10.2196/48254 %U https://mededu.jmir.org/2023/1/e48254 %U https://doi.org/10.2196/48254 %U http://www.ncbi.nlm.nih.gov/pubmed/37578934 %0 Journal Article %@ 1947-2579 %I JMIR Publications %V 15 %N %P e50934 %T Framework for Classifying Explainable Artificial Intelligence (XAI) Algorithms in Clinical Medicine %A Gniadek,Thomas %A Kang,Jason %A Theparee,Talent %A Krive,Jacob %+ Department of Biomedical and Health Information Sciences, University of Illinois at Chicago, 1919 W Taylor St 233 AHSB, MC-530, Chicago, IL, 60612, United States, 1 312 996 1445, krive@uic.edu %K explainable artificial intelligence %K XAI %K artificial intelligence %K AI %K AI medicine %K pathology informatics %K radiology informatics %D 2023 %7 1.9.2023 %9 Viewpoint %J Online J Public Health Inform %G English %X Artificial intelligence (AI) applied to medicine offers immense promise, in addition to safety and regulatory concerns. Traditional AI produces a core algorithm result, typically without a measure of statistical confidence or an explanation of its biological-theoretical basis. Efforts are underway to develop explainable AI (XAI) algorithms that not only produce a result but also an explanation to support that result. Here we present a framework for classifying XAI algorithms applied to clinical medicine: An algorithm’s clinical scope is defined by whether the core algorithm output leads to observations (eg, tests, imaging, clinical evaluation), interventions (eg, procedures, medications), diagnoses, and prognostication. Explanations are classified by whether they provide empiric statistical information, association with a historical population or populations, or association with an established disease mechanism or mechanisms. 
XAI implementations can be classified based on whether algorithm training and validation took into account the actions of health care providers in response to the insights and explanations provided or whether training was performed using only the core algorithm output as the end point. Finally, communication modalities used to convey an XAI explanation can be used to classify algorithms and may affect clinical outcomes. This framework can be used when designing, evaluating, and comparing XAI algorithms applied to medicine. %M 38046562 %R 10.2196/50934 %U https://ojphi.jmir.org/2023/1/e50934 %U https://doi.org/10.2196/50934 %U http://www.ncbi.nlm.nih.gov/pubmed/38046562 %0 Journal Article %@ 1947-2579 %I JMIR Publications %V 14 %N 1 %P e12851 %T Roles of Health Literacy in Relation to Social Determinants of Health and Recommendations for Informatics-Based Interventions: Systematic Review %D 2022 %7 ..2022 %9 %J Online J Public Health Inform %G English %X Objective: There is a low rate of online patient portal utilization in the U.S. This study aimed to utilize a machine learning approach to predict access to online medical records through a patient portal. Methods: This is a cross-sectional predictive machine learning algorithm-based study of Health Information National Trends datasets (Cycles 1 and 2; 2017-2018 samples). Survey respondents were U.S. adults (≥18 years old). The primary outcome was a binary variable indicating that the patient had or had not accessed online medical records in the previous 12 months. We analyzed a subset of independent variables using k-means clustering with replicate samples. A cross-validated random forest-based algorithm was utilized to select features for a Cycle 1 split training sample. A logistic regression and an evolved decision tree were trained on the rest of the Cycle 1 training sample. 
The Cycle 1 test sample and Cycle 2 data were used to benchmark algorithm performance. Results: Lack of access to online systems was less of a barrier to online medical records in 2018 (14%) compared to 2017 (26%). Patients accessed medical records to refill medicines and message primary care providers more frequently in 2018 (45%) than in 2017 (25%). Discussion: Privacy concerns, portal knowledge, and conversations between primary care providers and patients predict portal access. Conclusion: Methods described here may be employed to personalize methods of patient engagement during new patient registration. %M 36685053 %R 10.5210/ojphi.v14i1.12851 %U %U https://doi.org/10.5210/ojphi.v14i1.12851 %U http://www.ncbi.nlm.nih.gov/pubmed/36685053 %0 Journal Article %@ 1947-2579 %I JMIR Publications %V 9 %N 1 %P e7605 %T Roles of Health Literacy in Relation to Social Determinants of Health and Recommendations for Informatics-Based Interventions: Systematic Review %D 2017 %7 ..2017 %9 %J Online J Public Health Inform %G English %X Objective: To explain the utility of using an automated syndromic surveillance program with advanced natural language processing (NLP) to improve clinical quality measures reporting for influenza immunization. Introduction: Clinical quality measures (CQMs) are tools that help measure and track the quality of health care services. Measuring and reporting CQMs helps to ensure that our health care system is delivering effective, safe, efficient, patient-centered, equitable, and timely care. The CQM for influenza immunization measures the percentage of patients aged 6 months and older seen for a visit between October 1 and March 31 who received (or reported previous receipt of) an influenza immunization. 
The Centers for Disease Control and Prevention recommends that everyone 6 months of age and older receive an influenza immunization every season, which can reduce influenza-related morbidity and mortality and hospitalizations. Methods: Patients at a large academic medical center who had a visit to an affiliated outpatient clinic during June 1 - 8, 2016 were initially identified using their electronic medical record (EMR). The 2,543 patients who were selected did not have documentation of influenza immunization in a discrete field of the EMR. All free text notes for these patients between August 1, 2015 and March 31, 2016 were retrieved and analyzed using the sophisticated NLP built within Geographic Utilization of Artificial Intelligence in Real-Time for Disease Identification and Alert Notification (GUARDIAN), a syndromic surveillance program, to identify any mention of influenza immunization. The goal was to identify additional cases that met the CQM measure for influenza immunization and to distinguish documented exceptions. The patients with influenza immunization mentioned were further categorized by GUARDIAN NLP into Received, Recommended, Refused, Allergic, and Unavailable. If more than one category was applicable for a patient, they were independently counted in their respective categories. A descriptive analysis was conducted, along with manual review of a sample of cases per category. Results: For the 2,543 patients who did not have influenza immunization documentation in a discrete field of the EMR, a total of 78,642 free text notes were processed using GUARDIAN. Four hundred fifty-three (17.8%) patients had some mention of influenza immunization within the notes, which could potentially be utilized to meet the CQM influenza immunization requirement. 
Twenty-two percent (n=101) of patients mentioned already having received the immunization, while 34.7% (n=157) of patients refused it during the study time frame. There were 27 patients with a mention of influenza immunization who could not be differentiated into a specific category. The number of patients placed into a single category of influenza immunization was 351 (77.5%), while 75 (16.6%) were classified into more than one category. See Table 1. Conclusions: GUARDIAN’s NLP can identify additional patients who may meet the CQM measure for influenza immunization or who may be exempt. This tool can be used to improve CQM reporting and improve overall influenza immunization coverage by using it to alert providers. Next steps involve further refinement of influenza immunization categories, automating the process of using the NLP to identify and report additional cases, as well as using the NLP for other CQMs. Table 1. Categorization of influenza immunization documentation within free text notes of 453 patients using NLP %R 10.5210/ojphi.v9i1.7605 %U %U https://doi.org/10.5210/ojphi.v9i1.7605 %0 Journal Article %@ 1947-2579 %I JMIR Publications %V 9 %N 1 %P e7650 %T Roles of Health Literacy in Relation to Social Determinants of Health and Recommendations for Informatics-Based Interventions: Systematic Review %D 2017 %7 ..2017 %9 %J Online J Public Health Inform %G English %X Objective: To evaluate prediction of laboratory diagnosis of acute respiratory infection (ARI) from participatory data using machine learning models. Introduction: ARIs have epidemic and pandemic potential. Prediction of the presence of ARIs from individual signs and symptoms in existing studies has been based on clinically sourced data [1]. Clinical data generally represent the most severe cases, and those from locations with access to healthcare institutions. 
Thus, the viral information that comes from clinical sampling is insufficient to capture either disease incidence in general populations or its predictability from symptoms. Participatory data (information that individuals today can produce on their own), enabled by the ubiquity of digital tools, can help fill this gap by providing self-reported data from the community. Internet-based participatory efforts such as Flu Near You [2] have augmented existing ARI surveillance through early and widespread detection of outbreaks and public health trends. Methods: The GoViral platform [3] was established to obtain self-reported symptoms and diagnostic specimens from the community (Table 1 summarizes participation details). Participants from the states with the most data (MA, NY, CT, NH, and CA) were included. Age, gender, zip code, and vaccination status were requested from each participant. Participants submitted saliva and nasal swab specimens and reported symptoms from the following list: fever, cough, sore throat, shortness of breath, chills, fatigue, body aches, headache, nausea, and diarrhea. Pathogens were confirmed via RT-PCR on a GenMark respiratory panel assay (full virus list reported previously [3]). Observations with missing, invalid, or equivocal lab tests were removed. Table 2 summarizes the binary features. Age categories were ≤20, >20 and <40, and ≥40, representing young, middle-aged, and old participants. Missing age and gender values were imputed based on overall distributions. Three machine learning algorithms were considered: Support Vector Machines (SVMs) [4], Random Forests (RFs) [5], and Logistic Regression (LR). Both individual features and their combinations were assessed. The outcome was the presence (1) or absence (0) of laboratory diagnosis of ARI. Results: Ten-fold cross validation was repeated ten times. The evaluation metrics used were positive predictive value (PPV), negative predictive value (NPV), sensitivity, and specificity [6]. LR and SVMs yielded the best PPV of 0.64 (standard deviation: ±0.08) with cough and fever as predictors. 
The best sensitivity of 0.59 (±0.14) was from LR using cough, fever, and sore throat. RFs had the best NPV and specificity of 0.62 (±0.15) and 0.83 (±0.10), respectively, with the CDC ILI symptom profile of fever and (cough or sore throat). Adding demographics and vaccination status did not improve performance of the classifiers. Results are consistent with studies using clinically sourced data: cough and fever together were found to be the best predictors of flu-like illness [1]. Because our data include mildly infectious and asymptomatic cases, the classifier sensitivity and PPV are low compared to results from clinical data. Conclusions: Fever and cough together are good predictors of ARI in the community, but clinical data may overestimate this due to sampling bias. Integration of participatory data can not only improve population health by actively engaging the general public [2] but also improve the scope of studies solely based on clinically sourced surveillance data. Table 1. Details of included participants. Table 2. Coding of binary features %R 10.5210/ojphi.v9i1.7650 %U %U https://doi.org/10.5210/ojphi.v9i1.7650