I am a recent PhD graduate from the CSIR-Institute of Genomics and Integrative Biology in New Delhi, India, where I conducted research under the guidance of Dr. Debasis Dash. My doctoral work focused on the identification of human proteoforms using mass spectrometry techniques. I am deeply passionate about unraveling the role of tissue-specific proteoforms in human biology. Prior to pursuing my PhD, I earned my Master's degree in Bioinformatics from Pondicherry University in Puducherry, India. I completed my Bachelor's degree in Science from Dyal Singh College under Delhi University in New Delhi. I spent my formative years in Arrah, Bihar, known for its rich cultural heritage and the vibrant "Bhojpuri" language. Beyond academia, I am an enthusiastic traveler and photographer; you can find a showcase of my work on my Instagram profiles. In my leisure time, I enjoy planning trips, reading books, watching movies, or simply indulging in some much-needed rest.
My research focuses on making sense of proteoforms using mass spectrometry datasets, along the following directions:
Identified proteoforms in different human brain tissues. We have developed a novel computational framework (PgxPlus) that identifies proteoforms often missed in conventional proteomic/proteogenomic searches by adapting the search database to account for tissue-specific variations.
Developed a scalable computational and statistical tool, PgxSAVy, for the analysis of variant peptides identified through mass spectrometry datasets. Our aim is to streamline the analysis process and improve the accuracy of peptide identification in proteomic studies.
Exploring the development of small tools to facilitate the handling of big data. We are focused on creating user-friendly applications that simplify the analysis and interpretation of complex datasets, ultimately enhancing research productivity and efficiency.
Additionally, I have conducted genomic analysis on whole-genome sequencing (WGS) and whole-exome sequencing (WES) data. I have also developed various pipelines for the analysis of other omics data, contributing to a comprehensive understanding of biological systems at multiple levels.
I specialize in big data analysis and data visualization, exploring the details hidden within massive datasets. My current focus is on proteomics, genomics, and proteogenomics, where I study human biology at the molecular level. I program in Perl, R, and Python, with R as my main tool for data visualization and statistical computing. I am comfortable working on both Linux and Windows. I automate workflows using shell scripting, Snakemake, and Nextflow, and I run parallel workloads with Slurm. As an advocate of open source, I value the collaborative spirit of the community and its principles of transparency, accessibility, and innovation.
My GitHub repositories are available at https://github.com/anuragraj/.
I also use R as a medium for artistic expression, creating visual pieces that blend data with design. I call these creations "aRt Works"; each one is born from a mix of code and imagination. I draw inspiration from various internet sources and from the collective wisdom of the R community.
Raj, A., Aggarwal, S., Singh, P., Yadav, A. K., & Dash, D. (2023). PgxSAVy: a tool for comprehensive evaluation of variant peptide quality in proteogenomics – catching the (un)usual suspects. Computational and Structural Biotechnology Journal, 2023-12. doi: https://doi.org/10.1016/j.csbj.2023.12.033
Raj, A., Aggarwal, S., Kumar, D., Yadav, A. K., & Dash, D. (2023). Proteogenomics 101: a primer on database search strategies. Journal of Proteins and Proteomics, 14, 287–301. doi: https://doi.org/10.1007/s42485-023-00118-4
Raj, A., Aggarwal, S., Yadav, A. K., & Dash, D. (2023). Quality control of variant peptides identified through proteogenomics- catching the (un)usual suspects. bioRxiv, 2023-05. doi: https://doi.org/10.1101/2023.05.31.542998
Aggarwal, S., Raj, A., Kumar, D., Dash, D., & Yadav, A. K. (2022). False discovery rate: the Achilles' heel of proteogenomics. Briefings in bioinformatics, bbac163. Advance online publication. doi: https://doi.org/10.1093/bib/bbac163 PMID: 35534181
Bhaskar, A.K., Naushin, S., Ray, A., Singh, P., Raj, A., Pradhan, S., Adlakha, K., Siddiqua, T.J., Malakar, D., Dash, D., & Sengupta, S. (2022). A High Throughput Lipidomics Method Using Scheduled Multiple Reaction Monitoring. Biomolecules. 12: 709. doi: https://doi.org/10.3390/biom12050709
Dasgupta, A., Chakraborty, R., Saha, B., Suri, H., Singh, P., Raj, A., Taneja, B., Dash, D., Sengupta, S., & Agrawal, A. (2021). Sputum Protein Biomarkers in Airway Diseases: A Pilot Study. International journal of chronic obstructive pulmonary disease, 16, 2203–2215. doi: https://doi.org/10.2147/COPD.S306035
Singh, P., Chakraborty, R., Marwal, R., Radhakrishan, V. S., Bhaskar, A. K., Vashisht, H., Dhar, M. S., Pradhan, S., Ranjan, G., Imran, M., Raj, A., Sharma, U., Singh, P., Lall, H., Dutta, M., Garg, P., Ray, A., Dash, D., Sivasubbu, S., Gogia, H., … Sengupta, S. (2020). A rapid and sensitive method to detect SARS-CoV-2 virus using targeted-mass spectrometry. Journal of proteins and proteomics, 1–7. doi: https://doi.org/10.1007/s42485-020-00044-9 PMCID: PMC7457902. PMID:33132628.
Mehani, B., Narta, K., Paul, D., Raj, A., Kumar, D., Sharma, A., Kaurani, L., Nayak, S., Dash, D., Suri, A., Sarkar, C., & Mukhopadhyay, A. (2020). Fusion transcripts in normal human cortex increase with age and show distinct genomic features for single cells and tissues. Scientific reports, 10(1), 1368. doi: https://doi.org/10.1038/s41598-020-58165-6
Full list of publications on Google Scholar
Research profile on ResearchGate