Data Science Research Group

Universidad Autónoma de Sinaloa

Parque de Innovación Tecnológica
Josefa Ortíz de Dominguez S/N Ciudad Universitaria, C.P. 80013.
Culiacán, Sinaloa, México.

The DSRG was created in 2015 with the main goal of conducting research on high-performance, large-scale data analysis. This includes data storage, management, processing, and knowledge discovery.

Since its creation, the group has been open to interdisciplinary, collaborative work. As a result, in 2016 the DSRG took part in creating the bioinformatics unit of the National Laboratory for the Research on Food Safety (LANIIA). At LANIIA, we help molecular biologists develop and use analytical tools for characterizing food pathogens through comparative genomics.

We promote interdisciplinary research using data science tools. Currently, we have ongoing collaborations with the departments of chemistry and pharmacology, plant sciences, accounting, and finance. We receive students from different disciplines and train them in data science. To this end, we organize regular workshops and invite students to participate in our ongoing projects.


Research

At the DSRG, we are interested in the following research topics:

  • Data Science and Machine Learning
  • Large-Scale Data Management and Processing
  • High Performance Data Analysis
  • Deep Learning
  • Bioinformatics and Computational Biology

Projects

DSRG members are involved in several projects, most of them in collaboration with other academic institutions, the Mexican government, and the private sector. Ongoing and recent projects are listed below.

Using Deep Learning for the Identification of Plant Species in the Mexican Flora from Images

Funded by the Mexican Council for Science and Technology (CONACyT). Co-PI. 2018 – 2020.

Deep Neural Networks for the Classification of “Street-view” Images

Funded by the Mexican Council for Science and Technology (CONACyT). 2018 – 2020.

Laboratorio Nacional para la Investigacion en Inocuidad Alimentaria (LANIIA, the National Laboratory for the Research on Food Safety): Bioinformatics Unit

Funded by the Mexican Council for Science and Technology (CONACyT). 2016 – 2018.

Software

R package: glm.deploy

'C' and 'Java' Source Code Generator for Fitted Glm Objects

This is an open-source R package available on CRAN. It provides two functions that generate C and Java source code for deploying fitted Generalized Linear Models (GLMs). Link to CRAN: glm.deploy
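To illustrate the general idea of generating deployable source code from a fitted model, here is a minimal sketch in Python. It is not the glm.deploy implementation and does not use its API: the function name `generate_c_scorer`, its parameters, and the example coefficients are all hypothetical, and the sketch assumes a binomial GLM with a logit link only.

```python
import math


def generate_c_scorer(coefficients, intercept, func_name="glm_predict"):
    """Return C source code for a logistic-regression scoring function.

    Hypothetical helper for illustration only; it assumes a binomial GLM
    with a logit link and plain numeric predictors.
    """
    values = [float(c) for c in coefficients]
    intercept = float(intercept)
    if not values:
        raise ValueError("at least one coefficient is required")
    if not all(math.isfinite(v) for v in values + [intercept]):
        raise ValueError("coefficients and intercept must be finite")

    n = len(values)
    beta = ", ".join(repr(v) for v in values)
    # The fitted coefficients are embedded as constants, so the generated
    # function has no runtime dependency on the modeling environment.
    return (
        "#include <math.h>\n\n"
        f"double {func_name}(const double x[{n}]) {{\n"
        f"    static const double beta[{n}] = {{{beta}}};\n"
        f"    double eta = {intercept!r};  /* intercept */\n"
        f"    for (int i = 0; i < {n}; ++i) eta += beta[i] * x[i];\n"
        "    return 1.0 / (1.0 + exp(-eta));  /* inverse logit link */\n"
        "}\n"
    )


if __name__ == "__main__":
    # Made-up coefficients, used only to show the shape of the output.
    print(generate_c_scorer([0.5, -1.25], intercept=0.1))
```

The emitted C function can be compiled into an application without R, which is the kind of deployment scenario the package description refers to.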


Recent Publications

Whole-genome sequencing of Staphylococcus aureus L401, a mecA-negative community-associated methicillin-resistant strain isolated from a healthy carrier

Maria E. Baez-Flores, Jose A. Magaña-Lizarraga, Jesus R. Parra-Unda, Yesmi P. Ahumada-Santos, Magdalena J. Uribe-Beltran, Bruno Gomez-Gil, Ines F. Vega-Lopez, and Rogelio Prieto-Alvarado. 2019. Whole-genome sequencing of Staphylococcus aureus L401, a mecA-negative community-associated methicillin-resistant strain isolated from a healthy carrier. Journal of Global Antimicrobial Resistance.

To appear

Next-Generation Heartbeat Classification with a Column-Store DBMS and UDFs

Oscar Castro-Lopez, Daniel E. Lopez-Barron, and Ines F. Vega-Lopez. 2019. Next-Generation Heartbeat Classification with a Column-Store DBMS and UDFs. Journal of Intelligent Information Systems. Springer. DOI 10.1007/s10844-019-00557-w.

April 2019

Multi-target compiler for the deployment of machine learning models

Oscar Castro-Lopez and Ines F. Vega-Lopez. 2019. Multi-target compiler for the deployment of machine learning models. In Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization (CGO 2019). IEEE Press, Piscataway, NJ, USA, 280-281.

February 2019

Fast deployment and scoring of support vector machine models in CPU and GPU

Oscar Castro-Lopez and Ines F. Vega-Lopez. 2018. Fast deployment and scoring of support vector machine models in CPU and GPU. In Proceedings of the 1st International Workshop on Machine Learning and Software Engineering in Symbiosis (MASES 2018). ACM, New York, NY, USA, 45-52. DOI: https://doi.org/10.1145/3243127.3243133

September 2018

ML2ESC: A Source Code Generator to Embed Machine Learning Models in Production Environments

Oscar Castro-Lopez and Ines F. Vega-Lopez. 2018. ML2ESC: A Source Code Generator to Embed Machine Learning Models in Production Environments. In Proceedings of the 14th International Conference on Data Science (ICDATA 2018). CSREA.

August 2018

People

The DSRG comprises faculty members, students, and technical staff.


Collaborate

We promote interdisciplinary research using data science tools. We always welcome faculty willing to explore this new interdisciplinary field.

We are always on the lookout for highly motivated students who would like to receive training in data science and gain hands-on experience by collaborating with us. We can offer scholarships for master's or doctoral students in the information science graduate program. Postdocs are also welcome, and we can usually help them obtain a scholarship from the Mexican Council for Science and Technology (CONACyT).

If you are interested, please contact Ines F. Vega-Lopez.