Harnessing Big Data in Ophthalmology

Abstract: Big data is the new gold, especially in health care. Advances in collecting and processing Electronic Medical Records (EMR), coupled with increasing computing capabilities, have resulted in growing interest in the use of big data in health care. Big data requires the collection and analysis of data at an unprecedented scale and represents a paradigm shift in health care, offering on one hand the capacity to generate new knowledge more quickly than traditional scientific approaches and, on the other hand, a holistic understanding of specific illnesses when socio-demographic factors are incorporated. Big data promises more personalized and precise medicine for patients, with improved accuracy, earlier diagnosis, and therapy geared to an individual’s unique combination of genes, environmental risk, and precise disease phenotype.


Ophthalmology has been an area of focus where results have been promising. The objective of this study was to determine whether the EMR at LV Prasad Eye Institute (LVPEI) in Hyderabad, India, can contribute to the management of patient care by studying how climatic and socio-demographic factors relate to eye disorders and visual impairment in the State of Telangana. The study was designed by merging a dataset obtained from the Telangana State Development Planning Society with an existing EMR of approximately 1 million patients who presented with different eye symptoms and were diagnosed with several diseases between 2011 and 2019, a timeframe of 8 years. The dataset obtained included weather and climatic variables to be tested alongside eye disorders. Microsoft Power BI was used to analyze the data through prescriptive and descriptive data analysis techniques, looking for patterns that point to high-risk climatic and socio-demographic factors correlated with eye diseases. Eye health risk in India has been linked to poor air quality and pollution. A total of 176 variables were studied, ranging from number of patient visits, type of complaints, and ocular diagnosis, among others, all aligned to climatic and demographic categories such as gender, age, occupation, temperature and rainfall. AI creative featuring techniques were used to narrow down the variables most affected by climatic and demographic factors, with the Cynefin Framework applied as a guide to simplify and structure the dataset for analysis.

Our findings revealed a high presence of cataract in the state of Telangana, mostly in rural areas and throughout the different weather seasons in India. Males tend to be the most affected as per the number of visits to the clinic, while homemakers make the most visits to the hospital, followed by employees, students, and laborers. While cataract is most dominant in the older age population, conditions such as astigmatism, conjunctivitis and emmetropia are more present in the younger age population. The study appears useful for planning preventive measures to manage the treatment of patients who present with eye disorders in Telangana. In addition, this research created a pathway for new methods in the study of how EMRs contribute to new knowledge in ophthalmology.

Keywords: big data, ophthalmology, ocular disease, artificial intelligence



© Common Ground Research Networks, Author(s) Name(s), All Rights Reserved.

Permissions: cgscholar.com/cg_support



India is home to over 8.3 million people with Vision Impairment (VI), the highest number in the world [1]. In 1976, India became the first country in the world to start a national program for control of blindness, with the goal of reducing blindness prevalence to 0.3 percent by 2020. Even so, the prevalence of blindness still stands at 1.99 percent, according to the National Blindness and Visual Impairment Survey, released in October 2019 [2] by the Union Ministry of Health and Family. The prevalence of blindness and visual impairment in Telangana, a state in Southern India, is one of the highest, as inferred from the survey. [2] The main causes indicated in the survey were cataract and refractive error [3].

All surveys in the country have shown that cataract is the most common cause of blindness, and all prevention of blindness programs have been “cataract-oriented.” However, it has recently been recognized that the visual outcome of cataract surgeries, as well as the training of ophthalmologists, have been less than ideal.

This study uses Artificial Intelligence (AI) and machine learning techniques to explore a dataset containing information on 873,448 patients who visited LV Prasad Eye Institute (LVPEI), a multi-tier ophthalmology hospital network, based in Hyderabad. The data was extracted from EyeSmart, the hospital’s Electronic Medical Record (EMR) and health management system, and then merged with climatic factors to test the correlation between climatic variables and ocular diseases presented by the patients [4].

In healthcare, ophthalmology deals with the diagnosis and treatment of eye disorders. Some known diseases in ophthalmology are cataracts, retinal disorders, macular degeneration, and others. The relatively rapid and recent adoption of EMRs in ophthalmology has been associated with the promise that the accumulation of large volumes of clinical data would facilitate quality improvement and help answer a variety of research questions. Given that EMRs are relatively new in most practices and that clinical data are inherently more complex than other fields that have been altered by the digital revolution, these proposed benefits have yet to be realized [5].

With the rise of big data, it has now become easier to study how culture, race, climate, and other socio-demographic factors correlate to the spread of ocular diseases. Studying risk factors, primarily those associated with climate and the environment, can lead to a better understanding of the causes, diagnosis and treatment of several eye diseases. [6] This has informed recent research in medicine and ophthalmology.

In Section 2 of this paper, we discuss and highlight different statistical tools and methods used to investigate the study of several demographic and climatic factors that impact individuals in Telangana. While Section 3 focuses on providing a thorough analysis of the data findings, Section 4 highlights key findings and trends, and Section 5 includes conclusion and recommendations pertaining to the use of big data and data merging in EMR to reveal new insights in the study of visual impairment and eye disorders in Telangana. Visual impairment has continually exhibited an escalating trend in underdeveloped countries over the past years, and in India, the burden of visual impairment is high in urban and rural areas. Eye-care services should be accessible and affordable to individuals in need. This study intends to discover how socio-demographic and climatic factors correlate to the number of individuals affected using Artificial Intelligence tools.


To gain insight into the climatic and socio-demographic factors that correlate to the risk of ocular diseases in the State of Telangana, we used multiple approaches utilizing AI, statistical software and programming languages, including Microsoft Power BI and Python, to explore the dataset, which contained information on 873,448 patients complaining of eye disorder symptoms across multiple categories of ocular diseases. Python was used to merge the datasets using the Pandas library through a column mutation process. This tool has the advantage of handling large and complex datasets. The process was, however, time-consuming given the large volume of patients. It was repeated more than once to ensure minimal error in the merging process. Microsoft Power BI was then used to model the data in order to obtain the visualizations and insights.
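As an illustration of the merging step described above, the following is a minimal sketch of joining visit records to climate readings with Pandas. The column names, the join keys (district and month), and all values are assumptions made for the example; they are not the EyeSmart schema or the study's actual merge procedure.

```python
import pandas as pd

# Hypothetical visit records; column names are assumptions, not the EyeSmart schema.
visits = pd.DataFrame({
    "patient_id": [1, 2, 3],
    "district": ["Hyderabad", "Paloncha", "Paloncha"],
    "visit_date": pd.to_datetime(["2015-07-03", "2015-07-19", "2015-08-02"]),
    "diagnosis": ["Cataract", "Pterygium", "Cataract"],
})

# Hypothetical monthly climate readings per district (values invented).
climate = pd.DataFrame({
    "district": ["Hyderabad", "Paloncha"],
    "month": ["2015-07", "2015-07"],
    "rainfall_mm": [160.0, 210.0],
})

# Derive the join key on the visit side as a "YYYY-MM" string.
visits["month"] = visits["visit_date"].dt.strftime("%Y-%m")

# Left join so that no visit is dropped when climate data is missing.
merged = visits.merge(
    climate,
    on=["district", "month"],
    how="left",
    validate="many_to_one",  # raise if the climate table has duplicate keys
    indicator=True,          # adds a _merge column: "both" or "left_only"
)

# Visits that found no climate row; worth inspecting before any analysis.
unmatched = int((merged["_merge"] == "left_only").sum())
```

Counting the unmatched rows after each run is one simple way to check for merge errors when the process has to be repeated.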

The dataset, which covers clinical visits from 2011 to 2019, includes demographic information on each patient, including age, gender, profession, date of visit, and district of residence, among other variables. To look further into this issue, we searched for other variables that may have a relationship with eye disorders and focused on climate.

The climate variables we examined were average temperature, minimum temperature, maximum temperature, humidity, rainfall, and solar radiation. These data were retrieved from the Telangana State Development Planning Society.[7] The findings relating temperature to cataract in older age were consistent across high and low temperatures.
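To make the idea of testing a climate variable against disease counts concrete, here is a small sketch that computes a Pearson correlation coefficient in plain Python. Pearson correlation is used here only as an example measure; the study does not state which statistic was applied in Power BI, and the rainfall and visit figures below are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented monthly values, for illustration only.
rainfall_mm = [10.0, 40.0, 160.0, 210.0, 90.0]
cataract_visits = [120, 135, 180, 205, 150]

r = pearson(rainfall_mm, cataract_visits)
```

A coefficient near 1 or -1 indicates a strong linear association; it does not by itself establish that the climate variable causes the disease.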

In addition, this research was undertaken to understand how data merging techniques can transform a complex dataset into a simpler format for analysis, in order to gain new insight into the development of cataract in patients in Telangana. We therefore used the Cynefin framework as a guide to achieving a more structured dataset. The Cynefin framework acts as a guide to healthcare practitioners and researchers because of its foundation in the management of information. [8] The tool was developed to offer support and direction in decision making for situations where the intricacy of the outcomes affects the nature of knowledge, forecast, and choice. [9] It has several domains, which necessitate different actions. The simple and complicated contexts are considered equivalent to an “ordered state” of the universe, which can be interpreted on the basis of cause-and-effect associations among the facts or findings, so that the right direction or pattern can be decoded. [5] However, in the case of “complex or chaotic” data, where researchers or healthcare practitioners are unable to formulate a definitive cause-and-effect association, there is no immediately conceivable relationship, and the Cynefin framework guides professionals to choose the right direction based on the “emerging patterns.” [5] This means that the chaotic or unordered state of the world requires pattern-dependent management for proper direction and sound decision making. [5]

Varied Types of Medical Complexities

Cynefin framework and its application in medical complexities

According to scientific evidence the parts or the components of the complex system may not show direct association in a linear pattern. These factors or parameters are openly prevailing within the environment due to which the interactions between those factors can occur at varying levels via recursive feedback loops. [9]

Cause and Effect Interpretation Using Cynefin Framework

 Cause and effect association of Cynefin framework in medical complexities

The above figure shows the cause-and-effect associations between the parameters, decoded with the aid of the Cynefin framework. In the figure, the square block indicates the “causal agent”, and the dots indicate the “effect agent”. The solid lines refer to a direct association between the two agents, whereas the dotted lines refer to a weak or probable association between them. When the relation or the conduct of the components of a complex adaptive system cannot be perceived in direct terms, it is said to be emergent and active in nature. Moreover, these factors change with time and with pressure from the surrounding environment, so that the system develops into a new form. [9]

The concept of medicine is complex, and the complexity that arises in general practice varies widely.[10] General practice builds expertise on the overall profile of the patients. In this regard, the theory of complexity demonstrates that the associations between varied confounding factors are complementary in nature.[10] The clinical specialist has to deal with the complex domain frequently during practice. Detailed analysis using statistical or analytical tools (AI), combined with investigation and the specialized knowledge of the specialist, helps to provide solutions to complicated problems, for instance, cataract and the factors that lead to its development.[5] The clinical specialist often finds problems within the disordered or chaotic zone, as the data do not show any direct or known association between the varied agents. However, by using the Cynefin Framework as a tool to convert complex data to simpler data, practitioners or researchers can recognize the probable underlying cause, which would be of immense help in the field of healthcare.

Snowden, the developer of the Cynefin Framework, highlighted that the concept of “best practice” is attributable to identifiable or known problems; for the complicated situations the “good practice” is advisable and for the complex problems the “emergent practice” is considered to be the most suitable.[5] Therefore, it is scientifically evident that with the use of Cynefin Framework, the complex or chaotic data which the research started with can be successfully converted to simpler data to find an emergent association between the co-factors that cause the development of cataract, referred to as “wisdom” in the DIKW pyramid, and ultimately will bring awareness among the community.[5]

Analysis: Findings and Trends

It has proven valuable to first observe which diseases are the most prevalent in the different areas of Telangana, and what age and gender are most affected to get a full understanding of the criticality of the eye disorder epidemic and to provide a baseline against which to compare the climate and economic variables examined. This section highlights key findings of the study, as well as trends in relation to the subject matter as per the demographic and climatic variables tested.

The use of EMRs in generating new insights has been an increasing trend in ophthalmology. Research in ophthalmology has benefited greatly from the use of EMRs in expanding the breadth of knowledge in areas such as disease surveillance, health services utilization and outcomes. In addition, the quantity of data available has increased so much that it is now highly recommended to work on data linkage systems in eye research, as such data can offer insights into advantages and limitations for the future direction of eye research. [8]

The timespan of this dataset is 2011-2019, a total of 8 years. The analysis was consistent with already known information, specifically on gender and age-related eye conditions. Creative featuring techniques to minimize the number of variables were helpful throughout the analysis, as they shed light on which patterns are more significant.

A. Gender and Eye Disorders
 Clinical Visits by Gender

Figure 3 indicates that between 2011 and 2019, 53% of the patients seen for eye disorders were male and 47% were female. This finding is in line with the gender study conducted on 2.3 million patients who presented to the LV Prasad Eye Institute from 2011 to 2019.[4] Globally, gender is one of the universally identified social determinants of health. In India, health inequalities between men and women have played a pivotal role in disease development, including eye disorders. With respect to eye care, women in India have generally been cited as having higher rates of blindness and being less likely to access appropriate eye services.[11] However, this study, which focused on Telangana, does not show that pattern, as male patients exceeded female patients. One possible reason is that Telangana was ranked among the top ten innovative and developed states in India according to the India Innovation Index 2019, where access to healthcare is available to and valued by both men and women.


India has been one of the countries where efforts to strengthen the evidence base for blindness control have received significant attention from policy planners and program managers. Over the past four decades, a series of population-based blindness and visual impairment surveys have been undertaken in India, using different survey methods. These included detailed eye examination surveys as well as rapid assessments.[12]

B. Occupation and Clinical Visits
Clinical Visits by Profession

In addition, when studying the correlation between profession and clinical visits, it appeared that homemakers, employees in the government and private sectors, and students make up the top three categories of those most affected. Figure 4 depicts this analysis and shows the top six professions. We can also see that agricultural workers and manual laborers tend to present with eye disorders as well, which could be due to the nature of the job, in which they are exposed to certain chemicals and dust and usually work in hot environments. Recent estimates from the World Health Organization indicate that 90 per cent of all those affected by visual impairment live in the poorest countries of the world.[5] India is home to one‐fifth of the world's visually impaired people and therefore, any strategies to combat avoidable blindness must take into account the socio‐economic conditions within which people live.[5]

Homemakers could also be read as housewives, who are at higher risk of visual disorders. This is in line with a 2009 study on women in Indian culture, which showed that housewives are more likely to suffer from heart disease than working women, owing to lack of education, a lifestyle associated with obesity, and cultural myths that do not prioritize women’s health. The same pattern can be seen for eye disorders and visual impairment in the sample of the population from Telangana, and it may stem from similar reasons. [10]

C. Location and Eye Disorders in Older Age (41-70)
Location and Eye Disorders in Older Age Population

Cataract, the clouding of the natural human lens, appears to be the leading ocular disease affecting the older age population in Telangana. Cataract is a condition known to affect older age, and this study reconfirms that finding.

D. Location and Eye Disorders in Younger Age (11-20)
Location and Eye Disorders in Younger Age Population

Astigmatism, which is an irregularity in the shape of the cornea, was present in the younger age population. Astigmatism has been described as a hereditary condition in ophthalmology.

In both contexts, eye disorders appeared to be mostly concentrated among residents of the district of Paloncha. Even though this district has a literacy rate of 77%, 10 percentage points higher than the state average of 67%, it was reported to have been hit by pollution and contaminated water in 2015. The state-run thermal power plant installed in 2015 caused pollution and health disorders, including eye disorders.[13] Residents complained of gray water, and doctors in Paloncha confirmed that prolonged exposure to air and water pollution has led to higher incidences of respiratory diseases, tuberculosis, skin diseases, blurring of vision, and irritation in the eyes. [13]

Consistent Prevalence of Cataract in Rainfall
Cataract Most Prevalent in Rainfall

Globally, cataract is the single most important cause of blindness, and the second most common cause of moderate and severe vision impairment (MSVI) according to the Global Burden of Disease, Injuries and Risk Factors Study, and it is most predominant in Southeast Asia. Cataract contributed to a worldwide 33.4% of all blindness and 18.4% of all MSVI. Translating the same into actual numbers, cataract caused blindness in 10.8 million of overall 32.4 million blind and visual impairment in 35.1 million of 191 million visually impaired individuals.[12]

Understanding the close relationship between climate, environment and the development of cataract is crucial for future preventative measures. In Telangana, the analysis shows that cataract is the disease most prevalent during rainfall.

E. Consistent Prevalence of Pterygium in Relation to Global Radiation
Pterygium in Relation to Global Radiation

We analyzed patients who presented with degeneration symptoms and correlated the diagnosis to climatic factors such as humidity, rainfall, temperature and global radiation. The above analysis shows the five most prevalent right-eye degeneration diseases in relation to global radiation. Pterygium appears to be the most pervasive, at over 46% of the total global radiation value. The analysis was done on a patient basis and not a disease basis, as the data showed that one patient can develop more than one disease.
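The distinction between counting on a patient basis and on a disease basis can be sketched as follows. This is one possible reading of that distinction, not the study's documented procedure: the patient IDs and rows are invented, and "Pinguecula" is included only as a second hypothetical diagnosis.

```python
from collections import Counter

# Hypothetical (patient_id, diagnosis) rows; one patient may appear more than once.
records = [
    (101, "Pterygium"),
    (101, "Pinguecula"),
    (102, "Pterygium"),
    (103, "Pterygium"),
    (103, "Pterygium"),  # repeat visit, same diagnosis
]

# Disease basis: every row counts, so repeat visits inflate the total.
by_row = Counter(diagnosis for _, diagnosis in records)

# Patient basis: each patient counts at most once per diagnosis.
by_patient = Counter(diagnosis for _, diagnosis in set(records))

# Number of distinct patients, the denominator for patient-level shares.
total_patients = len({patient_id for patient_id, _ in records})
```

With these rows, the row-level count for Pterygium is higher than the patient-level count because patient 103 appears twice with the same diagnosis.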

F. Consistent Prevalence of Pterygium in Relation to Windspeed
 Pterygium in Relation to Windspeed

The analysis above shows the five most prevalent right-eye diseases with degeneration as a symptom and how the diseases are influenced by maximum windspeed. Pterygium was again the most common among patients, concentrated at an average maximum windspeed of between 10.2 and 10.9.

G. Breakdown of Cataract by Gender
Breakdown of cataract by patients by gender

A total of 102,509 cataract patients were identified for the years 2011-2019. The sample is therefore based on the 102,509 patients from the state of Telangana who presented to LVPEI during that timeframe. The figure above shows the gender composition of cataract patients: 48,219 (47.04%) are male, while 54,290 (52.96%) are female.

H. Breakdown of Cataract Patients by District
Breakdown of cataract patients by district

The above figure shows the breakdown of the number of cataract patients across the district types: urban, rural, and metropolitan. Cataract is most prevalent in rural areas, with 58,123 cases recorded. This is followed by metropolitan districts with 27,150 cases, while urban districts have 17,236 cases. Eye diseases worsen an individual's quality of life and satisfaction.[14] Spatial location, the cultural and financial condition of individuals and, most importantly, the rate of access to healthcare organizations are considered to be the perceived barriers. Cataract is considered to be one of the main causes of visual impairment or blindness, mostly among middle-income and poorly resourced families.[15] These variations might be attributed to spatial location, which makes it difficult to reach distant health care centers, especially for female patients. [15] Moreover, limited awareness of eye diseases and financial hardship play the most significant roles in the worsening of cataract among the rural populace. The studies conducted by both Joshi and Murthy et al. support this view and highlight the higher charges of modern cataract treatments such as intraocular lens implantation. Another important reason for the rising number of cases in rural areas is the tendency to use traditional eye medicines, which have produced contradictory outcomes because of the toxicity and infections caused by the agents.
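The counts reported in sections G and H can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text; the variable names are chosen for the example.

```python
# Cataract patient counts as reported in sections G and H.
gender_counts = {"male": 48_219, "female": 54_290}
district_counts = {"rural": 58_123, "metropolitan": 27_150, "urban": 17_236}

# Both breakdowns should add up to the same cohort size.
gender_total = sum(gender_counts.values())
district_total = sum(district_counts.values())

# Percentage share of each group, rounded to two decimals.
gender_pct = {k: round(100 * v / gender_total, 2) for k, v in gender_counts.items()}
district_pct = {k: round(100 * v / district_total, 2) for k, v in district_counts.items()}

# District type with the most recorded cataract cases.
largest_district = max(district_counts, key=district_counts.get)
```

Both breakdowns sum to the 102,509 patients stated in section G, and the gender shares reproduce the reported 47.04% and 52.96%.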

Conclusion and Recommendation

This data analytics study provides an expanded exploration of how socio-demographic and climatic factors affect the prevalence of visual impairment and eye disorders in Telangana. Applying several statistical techniques, including pattern recognition, and generating other data visualizations, we were able to validate previously identified findings of gender’s relation to eye disorders in Telangana. We found the tools we used to be very useful for discovery research to better understand the sample set of patients and to generate informative and understandable visuals.

Big data can serve to boost the applicability of clinical research studies into real-world scenarios, where population, race, and climate create a challenge. It equally provides the opportunity to enable effective and precision medicine by performing patient stratification. This is indeed a key task toward personalized healthcare. A better use of medical resources by means of personalization can lead to well-managed health services that can overcome the challenges of a diverse population where poverty is high. Thus, creative featuring and data merging for health management of EMRs can have an impact on future clinical research.

From a systems perspective, we observe that a patient is influenced by several co-factors that result in the development of eye disorders, which is significant when studying patient care from a holistic standpoint. AI tools create a pathway to merging publicly available data and aligning multiple variables as part of the overall influence. This technique is widely applied in decision-making and outcome assessment for an enhanced healthcare experience, in which modeling knowledge and expert experience are studied more thoroughly for new pattern recognition. However, the number of variables must be minimized in order to capture the underlying knowledge; otherwise patterns will be harder to spot. We therefore intend to apply this approach in the future with fewer variables to overcome the challenges encountered in the first phase of data merging.

We recommend that the authorities spend more time and funds on creating awareness to educate individuals and families about the visual impairment crisis in Telangana. Creation of awareness is one of the most comprehensive approaches to sensitize communities concerning the consequences of eye disorders, but also one of the avenues to equip individuals with knowledge, skills and correct attitudes towards a healthier lifestyle.

Besides the creation of awareness, this study also recommends that ophthalmologists seek to understand all the factors that influence a disease beyond medical history, and consider each patient individually in terms of income and cultural upbringing. A more individualized approach to educating patients about the importance of self-care can help them avoid high-risk situations that can cause eye disorders. Starting from an earlier age would make prevention more effective and could reduce the number of individuals affected by vision impairment in Telangana.


  1. Das AV, Kammari P, Vadapalli R, Basu S. Big data and the eyeSmart electronic medical record system - An 8-year experience from a three-tier eye care network in India. Indian J Ophthalmol [serial online] 2020 [cited 2020 Mar 2];68:427-32.
  2. Kaur, 2019. https://www.downtoearth.org.in/news/health/cataract-top-cause-of-blindness-in-india-finds-survey-67187
  3. Express News Service, October 2019. Prevalence of Blindness, Visual Impairment High in Telangana. https://www.newindianexpress.com/cities/hyderabad/2019/oct/11/prevalence-of-blindness-visual-impairment-high-in-telangana-2045886.html
  4. Srinivas Marmamula, Rohit C Khanna, Gullapalli N Rao. Int J Ophthalmol. 2016 [serial online] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4886893/
  5. Gray, B., 2017. The Cynefin framework: applying an understanding of complexity to medicine. Journal of Primary Health Care, 9(4), pp.258-261.
  6. Boland, M. V. (2016). Big data, big challenges. Ophthalmology, 123(1), 7-8.
  7. Telangana State Development Planning Society. [online] https://tsdps.telangana.gov.in/ last accessed [11 August, 2020]
  8. Snowden, D.J. and Boone, M.E., 2007. A leader's framework for decision making. Harvard Business Review, 85(11), p.68.
  9. Kempermann, G., 2017. Cynefin as reference framework to facilitate insight and decision-making in complex contexts of biomedical research. Frontiers in Neuroscience, 11, p.634.
  10. Sturmberg, J.P. and Martin, C.M., 2009. Complexity and health–yesterday’s traditions, tomorrow’s future. J Eval Clin Pract, 15(3), pp.543-548.
  11. Mathews D. How gender influences health inequalities. Nurs Times 2015; 111: 21–23.
  12. Kaur, K., 2018. Cataract Blindness: Socioeconomic Factors Associated with Treatment Barriers and High Blindness Rates for Women in Rural Regions of Andhra Pradesh
  13. Suchitra, 2015. Stream of Ash. Down to Earth https://www.downtoearth.org.in/coverage/stream-of-ash-4403 6
  14. Joshi, M.V., 2015. Epidemiological study of patients availing free cataract services of national programme of control of blindness. Journal of Clinical Ophthalmology and Research, 3(1), p.9.
  15. Murthy, G.V., Jain, B.K., Shamanna, B.R. and Subramanyam, D., 2014. Improving cataract services in the Indian context. Community Eye Health, 27(85), p.4.

