About Me
My name is Wenqi Jiang. I’m a software engineer focusing on big data infrastructure and backend services, specializing in the design and implementation of scalable, highly available, low-latency distributed systems.
I am proficient in Java and Scala, and I have studied React and Flutter to find interesting challenges.
Furthermore, I enjoy powerlifting and traveling in my spare time.
Experience and Projects
Ontology Engineer (Scholarship)
Ontology Engineering Group
Feb. 2022 - Jul. 2022 · 6 mos
oeg.fi.upm.es
A student scholarship for the development of linking algorithms and graph linkers.
- Analyzed ontologies from different data sources, such as DBpedia and Wikidata;
- Specialized 7 link discovery algorithms, including string-based similarity and geometric relationships;
- Implemented a SPARQL-based linking application using Apache Jena, and evaluated it with the OAEI-2021 datasets;
- Wrote the master’s thesis.
Keywords: Ontology Graph Linker RDF Text Similarity DE-9IM WKT SPARQL Apache Jena OAEI
Big Data Engineer
Shenzhen Togic Software Technology Co.
May 2019 - Aug. 2021 · 2 yrs 4 mos
51togic.com
Shenzhen Togic Software Technology Co. was established in 2010. The company focuses on home Internet TV clients, with its Togic (similar to Netflix) and WEBOX (similar to Apple TV) delivering a simple, intuitive, and lightweight experience for family entertainment, quickly becoming the most popular app on TV systems. The platform currently has nearly 500,000 daily users, and over 9 million customers worldwide have used these products.
Offline Data Warehouse
- Maintained the Hadoop cluster containing over 200 TB of data, including user and system logs from 3 different sources;
- Upgraded big data components, including Hadoop, Spark, Kafka, etc., and monitored them on CDH;
- Designed and implemented a four-layer data warehouse based on the star schema using Hive, which had nearly 150 tables and received over 300 GB per day;
- Rewrote the ETL pipeline, which saved over 80% of storage and fixed the small-files problem on HDFS.
Keywords: CDH Kafka Hadoop HIVE Spark
Data Statistics System
- Restructured the statistics system based on the new offline data warehouse;
- Rewrote the code using the Spark DSL, making it more readable and maintainable;
- Optimized the system’s performance, reducing processing time by 25% compared to the previous version;
- Implemented more than 30 new statistical requirements and fixed bugs.
Keywords: Sqoop Hive Spark Scala Elasticsearch MySQL MongoDB Python
Log Collection Service
- Designed and implemented the log collection pipeline using Nginx and Netty, improving performance, stability, and scalability;
- Processed all logs into the Kafka cluster, which has 5 brokers, 3 replicas, and 7 days of retention, improving robustness.
Keywords: Nginx Netty Java Kafka HDFS
Backend Engineer
Beijing Webstudio Information Technology Co.
Jul. 2018 - Apr. 2019 · 10 mos
wbdatavis.com
Beijing Webstudio Information Technology Co. was founded in 2007 and focuses on quality-based and innovative development, providing data analysis and visualization, as well as official website marketing solutions which span a wide range of business disciplines and application situations.
- Implemented backend APIs for big data visualization dashboards and general reports for the BI applications;
- Developed more than 25 data operators, such as summation, date conversion, string processing, etc.;
- Implemented the one-click function of syncing all user information;
- Expanded 16 data sources, including MySQL, SQL Server, MongoDB, etc.;
- Maintained the big data cluster;
- Fixed bugs and developed new features for the company’s previous projects.
Keywords: Dubbo Spring Boot MyBatis Hibernate Java Nginx MySQL Redis Docker FastDFS CDH Hadoop HBase Impala Spark ML
Software Engineer (Internship)
Knowledge Engineering Group
Nov. 2017 - Jun. 2018 · 8 mos
keg.cs.tsinghua.edu.cn
The Knowledge Engineering Group (KEG) of Tsinghua University was established in 1996 and is devoted to research on social network analysis, news mining, semantic web, knowledge graph construction, etc.
- Organized and annotated critical illness insurance documents;
- Trained models to extract the logical structure of insurance documents using GROBID;
- Created a desktop application using the above models;
- Maintained datasets in NoSQL databases, such as MongoDB and Neo4j;
- Completed the thesis.
Keywords: NLP CRF GROBID Neo4j MongoDB JavaFX
Open Source
A definitive guide to decentralized app (Dapp) development on blockchain, with step-by-step Dapp practice through actual projects.
- Contributed various tutorial tasks;
- Translated the basic tasks;
- Shared the latest blockchain news and cutting-edge techniques in the community.
Keywords: Blockchain web3.js Solidity Dapp Ethereum Hardhat Alchemy Infura DAO.
Education
MS Data Science
Polytechnic University of Madrid
2021 - 2022
The Polytechnic University of Madrid (UPM) is a public university founded in 1971 as the result of merging different Technical Schools of Engineering and Architecture, originating mainly in the 18th century. The Engineering and Architecture Schools of UPM have contributed significantly to the history of Spanish technology.
The major subjects:
- Big Data / Cloud Computing and Big Data Ecosystems Design;
- Data Processes / Information Retrieval, Extraction and Integration / Time Series Data Mining;
- Statistical Data Analysis / Machine Learning;
- Open Data and Knowledge Graphs / Intelligent Systems / Deep Learning;
- Introduction to Research Methodology / Ethical, Legal and Social Aspects etc.
BE Computer Science & Technology
Beijing Information Science & Technology University
2014 - 2018
The Beijing Information Science and Technology University (BISTU) was founded in 2008 by merging the Beijing Machinery Industry Institute and the Beijing Information Engineering College, both of which were created in the 19th century. It offers an excellent academic discipline system in information science and technology and serves around 15,000 students from China and abroad.
The major subjects:
- Principles of Computer Compositions / Computer Architectures;
- Compilation Principles / Practice of Embedded Application System;
- Operating System / Principles and Applications of Database;
- Advanced Math / Linear Algebra / Discrete Mathematics / Probability and Statistics;
- Programming Languages: C, C++, Java;
- Data Structure, Algorithm, etc.
A Little More About Me
Alongside my interests in data and software engineering, some of my other interests and hobbies are powerlifting, watching UFC, traveling, and CrossFit.