Wenqi Jiang

Software Engineer

About Me

My name is Wenqi Jiang. I’m a software engineer focusing on big data infrastructure and backend services, specializing in the design and implementation of scalable, highly available, low-latency distributed systems. I am proficient in Java and Scala, and I have also studied React and Flutter in search of interesting challenges. In my spare time, I enjoy powerlifting and traveling.

Experience and Projects

Ontology Engineer (Scholarship)

Ontology Engineering Group

Feb. 2022 - Jul. 2022 · 6 mos

oeg.fi.upm.es

A student scholarship for the development of linking algorithms and graph linkers.

  • Analyzed ontologies from different data sources, such as DBpedia and Wikidata;
  • Specialized 7 link discovery algorithms, including string-based similarity and geometric relationships;
  • Implemented a SPARQL-based linking application using Apache Jena, and evaluated it with the OAEI-2021 datasets;
  • Wrote the master’s thesis.

Keywords: Ontology Graph Linker RDF Text Similarity DE-9IM WKT SPARQL Apache Jena OAEI

Big Data Engineer

Shenzhen Togic Software Technology Co.

May 2019 - Aug. 2021 · 2 yrs 4 mos

51togic.com

Shenzhen Togic Software Technology Co. was established in 2010. The company focuses on home Internet TV clients, with its Togic (similar to Netflix) and WEBOX (similar to Apple TV) delivering a simple, intuitive, and lightweight experience for family entertainment, quickly becoming the most popular app on TV systems. The platform currently has nearly 500,000 daily users, and over 9 million customers worldwide have used these products.

Offline Data Warehouse

  • Maintained the Hadoop cluster containing over 200 TB of data, including user and system logs from 3 different sources;
  • Upgraded big data components, including Hadoop, Spark, Kafka, etc., and monitored them on CDH;
  • Designed and implemented a four-layer data warehouse based on the star schema using Hive, which had nearly 150 tables and received over 300 GB of data per day;
  • Rewrote the ETL pipeline, which reduced storage usage by over 80% and fixed the small-files problem on HDFS.

Keywords: CDH Kafka Hadoop Hive Spark


Data Statistics System

  • Restructured the statistics system based on the new offline data warehouse;
  • Rewrote the code using the Spark DSL, making it more readable and maintainable;
  • Optimized the system’s performance, reducing processing time by 25% compared to the previous version;
  • Implemented more than 30 new statistical requirements and fixed bugs.

Keywords: Sqoop Hive Spark Scala Elasticsearch MySQL MongoDB Python


Log Collection Service

  • Designed and implemented the log collection pipeline using Nginx and Netty, improving performance, stability, and scalability;
  • Routed all logs into a Kafka cluster with 5 brokers, 3 replicas, and 7 days of retention, improving robustness.

Keywords: Nginx Netty Java Kafka HDFS

Backend Engineer

Beijing Webstudio Information Technology Co.

Jul. 2018 - Apr. 2019 · 10 mos

wbdatavis.com

Beijing Webstudio Information Technology Co. was founded in 2007 and focuses on quality-based and innovative development, providing data analysis and visualization, as well as official website marketing solutions that span a wide range of business disciplines and application scenarios.

  • Implemented backend APIs of big data visualization dashboards and general reports for the BI applications;
  • Developed more than 25 data operators, such as summation, date conversion, and string processing;
  • Implemented the one-click function of syncing all user information;
  • Added support for 16 data sources, including MySQL, SQL Server, and MongoDB;
  • Maintained the big data cluster;
  • Fixed bugs and developed new features for the company’s earlier projects.

Keywords: Dubbo Spring Boot MyBatis Hibernate Java Nginx MySQL Redis Docker FastDFS CDH Hadoop HBase Impala Spark ML

Software Engineer (Internship)

Knowledge Engineering Group

Nov. 2017 - Jun. 2018 · 8 mos

keg.cs.tsinghua.edu.cn

The Knowledge Engineering Group (KEG) of Tsinghua University was established in 1996 and is devoted to research on social network analysis, news mining, the semantic web, knowledge graph construction, etc.

  • Organized and annotated critical illness insurance documents;
  • Trained models to extract the logical structure of insurance documents using GROBID;
  • Created a desktop application using the above models;
  • Maintained datasets in NoSQL databases such as MongoDB and Neo4j;
  • Completed the thesis.

Keywords: NLP CRF GROBID Neo4j MongoDB JavaFX

Open Source

Dapp-Learning

Apr. 2022 - Present

github.com/Dapp-Learning-DAO

Definitive Guide for Decentralized-app (Dapp) Development on Blockchain. Step-by-step Dapp practice through actual projects.

  • Contributed various tutorial tasks;
  • Translated the basic tasks;
  • Shared the latest blockchain news and cutting-edge techniques in the community.

Keywords: Blockchain web3.js Solidity Dapp Ethereum Hardhat Alchemy Infura DAO

Education

MS Data Science

Polytechnic University of Madrid

2021 - 2022

The Polytechnic University of Madrid (UPM) is a public university founded in 1971 through the merger of different Technical Schools of Engineering and Architecture, most of which originated in the 18th century. The Engineering and Architecture Schools of UPM have contributed significantly to the history of Spanish technology.

The major subjects:

  • Big Data / Cloud Computing and Big Data Ecosystems Design;
  • Data Processes / Information Retrieval, Extraction and Integration / Time Series Data Mining;
  • Statistical Data Analysis / Machine Learning;
  • Open Data and Knowledge Graphs / Intelligent Systems / Deep Learning;
  • Introduction to Research Methodology / Ethical, Legal and Social Aspects, etc.

BE Computer Science & Technology

Beijing Information Science & Technology University

2014 - 2018

The Beijing Information Science and Technology University (BISTU) was founded in 2008 by merging the Beijing Machinery Industry Institute and the Beijing Information Engineering College, both of which were established in the 19th century. It offers an excellent academic discipline system in information science and technology and serves around 15,000 students from China and abroad.

The major subjects:

  • Principles of Computer Composition / Computer Architecture;
  • Compilation Principles / Practice of Embedded Application System;
  • Operating Systems / Principles and Applications of Database;
  • Advanced Math / Linear Algebra / Discrete Mathematics / Probability and Statistics;
  • Programming Languages: C, C++, Java;
  • Data Structures, Algorithms, etc.

A Little More About Me

Alongside my interests in data and software engineering, my other interests and hobbies include powerlifting, watching UFC, traveling, and CrossFit.