Genomics software has become an integral part of scientific research and an important driver of productivity. A well-designed application uses fewer computing resources and eases the researcher’s workload, saving time and increasing efficiency.
However, such software tends to be developed on low budgets, usually by academic staff without the involvement of experienced software engineers. These solutions are poorly supported afterwards and suffer from a weak user feedback loop, leaving us with unreliable software that is difficult to use and prone to errors.
The problem is becoming more and more pressing, as the amount of genetic data stored, analysed and studied is constantly increasing. Such software is typically funded through state support, which comes with a large number of conditions and restrictions. With that in mind, let’s try to figure out what types of software can be found on the market and what tasks they are used to solve.
Some programs use information about the activity of certain genes to work out the pathways that regulate our genome. They track various patterns, such as two genes showing the same level of activity across different conditions, from which a conclusion can be drawn about their interaction with each other.
Take gene interaction estimation as an example. There are many algorithms for estimating gene interactions, but the most common ones are not always the best. An application that offers a complete library of these algorithms, and lets the user select and apply one in two clicks, is therefore a promising direction. AI/ML algorithms also look promising in this area; however, they are difficult to develop and require stricter control during validation.
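To make the idea of "two genes having the same degree of activity in different conditions" concrete, here is a minimal sketch that uses Pearson correlation as the similarity measure. This is only one of the many possible algorithms, and not necessarily the best one. The gene names, the expression values and the 0.9 threshold are made up for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpressed_pairs(expression, threshold=0.9):
    """Return gene pairs whose |r| meets the threshold, strongest first."""
    pairs = []
    for (g1, p1), (g2, p2) in combinations(expression.items(), 2):
        r = pearson(p1, p2)
        if abs(r) >= threshold:
            pairs.append((g1, g2, r))
    return sorted(pairs, key=lambda t: -abs(t[2]))


# Made-up expression levels for three hypothetical genes in five conditions.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0, 9.9],
    "GENE_C": [5.0, 1.0, 4.0, 2.0, 3.0],
}
pairs = coexpressed_pairs(expression)
for g1, g2, r in pairs:
    print(f"{g1} and {g2} move together (r = {r:.3f})")
```

A high correlation only flags a candidate interaction. It does not prove one, which is part of why validation matters so much in this area.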
Other tools predict health risks, like software that calculates an individual’s risk level for breast or ovarian cancer based on family history and other data. This is the kind of software behind the lifestyle recommendations presented to people on platforms like 23andMe, Orig3n, and others. At the moment, the functionality of these applications is limited to displaying general interpreted information. But imagine being able to take a closer look at the relationships between your genes, or at a detailed map of your genome. With technological advancements in the near future, more and more insight will become available from the same genome stored on such a platform.
As a side note, the emergence of such applications, and their increasing adoption, will lead to standardisation, which means simpler genetic testing, easier data transfer and higher research productivity. However, for the time being, it’s a free-for-all.
Large genetic data storage
The first challenge is to store huge amounts of genetic data. The second challenge is to retrieve information from these archives efficiently.
It is becoming essential to design and implement capable next-generation archiving software. The mass nature of genetic testing implies the creation of large archives, which require novel indexing approaches for faster subsequent retrieval. Next-gen genetic storage and retrieval systems should allow you to quickly find, export and send files several hundred gigabytes in size. Keep in mind that the raw data of a sequenced human genome can take up to 600 GB. In general, such an application follows the same principles as other big data management applications.
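As a rough illustration of why indexing matters for retrieval, the sketch below scans a FASTA-like stream once, records the byte offset of each record, and then jumps straight to a requested record instead of rereading the whole file. It is a deliberately simplified sketch, not a production design. The record names and the in-memory sample data are invented, and a real archive would also need compression, checksums and a persistent index.

```python
import io


def build_offset_index(handle):
    """Scan a FASTA-like binary stream once, recording where each record starts."""
    index = {}
    handle.seek(0)
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        if line.startswith(b">"):
            parts = line[1:].split()
            if parts:  # ignore headers that carry no identifier
                index[parts[0].decode("ascii")] = offset
    return index


def fetch_record(handle, index, record_id):
    """Seek directly to one record and return its sequence as a string."""
    handle.seek(index[record_id])
    handle.readline()  # skip the header line
    chunks = []
    for line in handle:
        if line.startswith(b">"):
            break  # reached the next record
        chunks.append(line.strip())
    return b"".join(chunks).decode("ascii")


# Tiny in-memory stand-in for a file that would be hundreds of gigabytes.
data = io.BytesIO(b">chr_demo_1 sample\nACGT\nACGT\n>chr_demo_2\nGGCC\nTTAA\n")
index = build_offset_index(data)
sequence = fetch_record(data, index, "chr_demo_2")
print(index, sequence)
```

The index is small compared with the data it describes, so it can be kept in memory or in a database while the sequences themselves stay on slower, cheaper storage.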
Modelling and simulation tools
In some cases, it is not enough to have a classical, queryable database at hand, or algorithms for tracking gene interactions. This is especially true for new, unexplored mutations.
Modelling the structure of a protein based on the sequence of the gene that encodes it is a very promising area of development. Knowing the structure of the protein, it is possible to draw conclusions about how its function may deviate from the norm.
With that in mind, it is quite possible to develop a system that predicts the phenotype (appearance) from the genotype. It is hard to overestimate the importance of such a working tool in forensics, or in family planning and artificial insemination. Of course, we’re still at least several years away from such software, and a lot of work remains to be done.
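Structure prediction itself is far beyond a short example, but its very first step can be sketched: translating a coding DNA sequence into an amino acid sequence with the standard genetic code, and checking which amino acids a mutation changes. The sequences below are invented, and the sketch assumes a simple substitution in an in-frame coding sequence, with no insertions, deletions or splicing.

```python
from itertools import product

# Standard genetic code, codons ordered by base T, C, A, G in each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks an unknown codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def amino_acid_changes(reference, variant):
    """List (position, reference_aa, variant_aa) for substituted residues.

    Only positions present in both proteins are compared, so this handles
    substitutions but not length-changing mutations.
    """
    ref_protein = translate(reference)
    var_protein = translate(variant)
    return [
        (pos + 1, r, v)
        for pos, (r, v) in enumerate(zip(ref_protein, var_protein))
        if r != v
    ]


# Invented example: a single A-to-T substitution in the second codon.
reference = "ATGGAGGAGTAA"
variant = "ATGGTGGAGTAA"
changes = amino_acid_changes(reference, variant)
print(translate(reference), translate(variant), changes)
```

A list of changed residues like this is the kind of input a structure modelling tool would then use to estimate whether the protein’s shape, and therefore its function, is affected.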
The road ahead
Another pressing problem is building a universal system that can store patient data, study mutations, make predictions, and quickly filter out unnecessary information.
Imagine software that is a hybrid of a LIMS, a doctor’s notebook and a genetic browser, all working in cooperation with the largest scientific databases that collect information on the pathogenicity, prevalence and relationships of mutations.
At HMND, we have been contributing to the development of such a universal application, which allows a complete analysis of all of a patient’s mutations without switching between countless browser pages or opening a huge number of highly specialised programs.
Two main tasks need to be solved when developing such an application:
- Establish a reliable feedback flow with the client
- Stay open to fresh ideas that can significantly improve the workflow as the science matures
Of course, aside from technical challenges, the main hurdle when creating such software is compliance. Software that is recognised as “medical” can be vetted and approved for diagnosis in one country, but may be suitable only for non-professional use in another, and be completely banned in a third.
In conclusion, developing software for genetic research holds undeniable promise, since this niche is currently often occupied by low-quality programs that have become too outdated, too specialised, or too buggy.