
Adverse drug events (AVRs), defined as injuries or other harm caused by a medication, lead to additional medical costs, prolonged hospitalization, morbidity and attributable disability worldwide~\cite{gottlieb2015ranking,bates1995incidence,classen1997adverse,lazarou1998incidence}. AVRs encompass all adverse drug reactions (ADRs), but also include preventable errors such as inappropriate dosing, dispensing errors and drug abuse. The discovery of AVRs has gained great attention in the healthcare community, and in the last few years several drug risk-benefit assessment strategies have been developed to analyze drug efficacy and safety using different medical data sources~\cite{van2015relative,price2016can,liu2016ensemble,trame2016systems}. A variety of computational methods combining natural language processing (NLP), machine learning and text retrieval algorithms have been employed to extract AVRs from such data sources~\cite{jonnagaddala2016binary,rastegar2016prioritizing,yang2015filtering}. Clinical trials, electronic health records (EHRs) and medical case reports are additional data-rich biomedical sources that have been utilized for AVR extraction~\cite{liang2016dl,iyer2014mining,harpaz2012novel}. In recent years, the volume of biomedical articles produced by scientists across the world has grown extensively. Figure 1 shows that the number of journal and conference papers published on medication studies (e.g., AVR and drug analysis, drug evaluation, and drug repositioning) grew rapidly from 2000 to 2014. The total number of publications in those years is approximately 461,305 articles. To roughly estimate the total size of these papers, we assumed a PDF file format for each article. The size of a PDF file depends on the number of pages and on the pictures and metadata inside the file.
Considering a 9--11 page PDF containing plain text along with a few pictures, each file may be almost 2.5 MB on average, so approximately 1.15 TB of articles ($461{,}305 \times 2.5$~MB) were generated in drug-associated studies from 2000 to 2014. Other file formats, such as XML, may be much larger. Scientific articles published in biomedical research are usually generated using standardized and principled methods and are therefore especially valuable for high-quality knowledge discovery. This deluge of information includes an enormous number of scientific publications on AVRs, an area in which many biomedical researchers are active, developing a variety of approaches for discovering, analyzing and monitoring AVRs~\cite{naples2016recent,noguchi2016prevention,nuckols2014effectiveness}. It is impossible for researchers, scientists and physicians to read and process this large body of scientific articles and remain abreast of the most important information regarding AVRs. Therefore, there is a pressing need to develop intelligent computational methods, particularly big data analytics solutions, to process this wealth of data efficiently. Big data biomedical text analysis uses advanced computational technologies, including big data infrastructure, NLP, statistical analytics and machine learning algorithms, to extract facts from text data. Systematically analyzing large numbers of scientific publications in this way can in turn generate new hypotheses.

\subsection{Objectives}
While AVR discovery from diverse biomedical data sources has long been studied in healthcare informatics, the use of big data from scientific articles and health-related social media for AVR discovery has been very limited so far.
The motivation of the current work is to study big data machine learning solutions, in particular big data neural networks (bigNN), to analyze AVRs from large-scale biomedical text data by developing a scalable framework that fulfills the following objectives: (1) to extract current knowledge and high-quality information about AVRs from full-text scientific articles and social media; (2) to utilize and adapt advanced NLP and machine learning algorithms at large scale using big data infrastructures; and (3) to provide better insights and trends in large-scale biomedical text analytics and to identify the challenges and potential enhancements towards efficient and accurate AVR discovery. We briefly summarize our main contributions as
