This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
misc
Running
I like to go running from time to time :) Check out my Athlinks. I made Chrome extensions to quickly download Garmin and Suunto activities (in .tcx).
Analyzed a real-world process management log and applied process mining algorithms to detect the actual patterns of the workflow. Many exceptions and bottlenecks were found on a monthly, per-team, and per-process basis. Possible causes were analyzed further, and the suggested improvements were proposed to and confirmed by the enterprise data specialist.
Skills: ProM, Python
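As a rough illustration of the kind of pattern detection described above, the sketch below counts "directly follows" relations between activities in an event log, a common first step in process mining. It is not the project's actual pipeline: the log format `(case_id, timestamp, activity)`, the function name `directly_follows`, and the toy data are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def directly_follows(events):
    """Count how often activity a is directly followed by activity b in a case.

    `events` is an iterable of (case_id, timestamp, activity) tuples
    (a hypothetical log format chosen for this sketch).
    """
    traces = defaultdict(list)
    for case_id, timestamp, activity in events:
        traces[case_id].append((timestamp, activity))

    counts = Counter()
    for steps in traces.values():
        steps.sort(key=lambda s: s[0])  # order each case by time
        for (_, a), (_, b) in zip(steps, steps[1:]):
            counts[(a, b)] += 1
    return counts


# Hypothetical toy log with two cases.
log = [
    ("c1", 1, "submit"), ("c1", 2, "review"), ("c1", 3, "approve"),
    ("c2", 1, "submit"), ("c2", 2, "review"), ("c2", 3, "review"), ("c2", 4, "approve"),
]
dfg = directly_follows(log)
```

Pairs such as `("review", "review")` that appear in only some cases are the sort of exception a closer analysis would then look into.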
Tracking Wellness via Android Phones
An Android application was developed and used to collect user data, including indoor and outdoor movement, phone movement, de-identified messages and emails, and phone calls. All data is encrypted and stored on a secure server. A deep learning algorithm is applied to the data, and the user’s depression level is estimated based on the user’s historical data. Feedback and suggestions on the user’s behavior are sent to the user if a depression risk is detected.
Achievements: Four phases of pilot studies were conducted among a wide range of students at the University of Delaware. User feedback shows that the application was very easy to use and that its functionality is very useful.
Enable Rich Text Search over Complex Relationships
MongoDB and Solr are integrated to deliver an infrastructure that enables real-time insert, update, and delete operations as well as rich search features.
Skills: MEAN web service stack, Mongo-Solr Sync
Contextual Suggestion
User profiling is essential in contextual suggestion. However, given that most users’ observed behaviors are sparse and their preferences are latent in an IR system, constructing accurate user profiles is generally difficult. We focus on location-based contextual suggestion and propose to leverage users’ opinions to construct the profiles, which significantly improves the system over category-based or description-based user profile modeling approaches.
Achievements:
Top 3 Performance on Contextual Suggestion track of the Text REtrieval Conference (TREC), 2015
Top 1 Performance on Contextual Suggestion track of the Text REtrieval Conference (TREC), 2014
Top 1 Performance on Contextual Suggestion track of the Text REtrieval Conference (TREC), 2013
Top 3 Performance on Contextual Suggestion track of the Text REtrieval Conference (TREC), 2012
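The idea of building a profile from opinions can be sketched very roughly as follows. This is a toy stand-in, not the published model: it represents a user by the words in reviews of places they liked and ranks candidate places by cosine similarity between that profile and each candidate's reviews. The function names and the sample reviews are assumptions made for illustration.

```python
import math
from collections import Counter


def bag(texts):
    """Bag-of-words term counts over a list of review texts."""
    return Counter(w for t in texts for w in t.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_candidates(liked_reviews, candidates):
    """Rank candidate places by similarity of their reviews to the user's profile."""
    profile = bag(liked_reviews)
    scored = [(cosine(profile, bag(reviews)), name) for name, reviews in candidates.items()]
    return [name for _, name in sorted(scored, key=lambda x: (-x[0], x[1]))]


# Hypothetical data: reviews of places the user liked, and two candidates.
liked = ["quiet cozy cafe with great espresso", "cozy bookstore friendly staff"]
candidates = {
    "Cafe A": ["great espresso and cozy seating"],
    "Club B": ["loud music crowded dance floor"],
}
ranking = rank_candidates(liked, candidates)
```

Here "Cafe A" ranks first because its reviews share opinion words with the profile, while "Club B" shares none.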
Axiomatic Ranking Models for Web Information Retrieval
The axiomatic approach searches for retrieval functions that satisfy a set of reasonable retrieval constraints. It can also be combined with semantic term matching, which performs query expansion by choosing semantically related terms. We tested the effectiveness of combining the two methods and applied the combination to web search. The results were promising.
Achievements:
Top 1 Performance on Web track of the Text REtrieval Conference (TREC), 2014
Top 2 Performance on Web track of the Text REtrieval Conference (TREC), 2013
Skills: Indri, Python, ClueWeb
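To make the query expansion step more concrete, here is a minimal sketch that picks expansion terms by a PMI-style association score computed from document co-occurrence. The scoring choice, the function name `expand_query`, and the toy documents are assumptions for illustration, not the configuration used in the TREC runs.

```python
import math
from collections import Counter
from itertools import combinations


def expand_query(query_terms, docs, k=2):
    """Append the k terms most strongly associated with any query term."""
    n = len(docs)
    doc_sets = [set(d.split()) for d in docs]
    df = Counter(t for s in doc_sets for t in s)  # document frequency
    co = Counter()                                # co-occurrence counts
    for s in doc_sets:
        for a, b in combinations(sorted(s), 2):
            co[(a, b)] += 1

    def pmi(a, b):
        joint = co[tuple(sorted((a, b)))]
        if joint == 0:
            return float("-inf")
        return math.log((joint * n) / (df[a] * df[b]))

    best = {}
    for q in query_terms:
        for t in df:
            if t in query_terms:
                continue
            score = pmi(q, t)
            if score > best.get(t, float("-inf")):
                best[t] = score
    ranked = sorted(best, key=lambda t: (-best[t], t))
    return list(query_terms) + ranked[:k]


# Hypothetical toy collection.
docs = [
    "car engine repair",
    "car automobile dealer",
    "automobile engine",
    "banana fruit",
    "fruit salad",
]
expanded = expand_query(["car"], docs, k=2)
```

Terms that never co-occur with a query term (such as "banana") are never added. Note that plain PMI favors rare terms, which is one reason real systems constrain or smooth the term selection.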
Bag-of-Terms IR Ranking Model Performance Bound Analysis
Most traditional retrieval models are based on bag-of-terms representations and model relevance using various collection statistics. Despite these efforts, the performance of bag-of-terms retrieval functions seems to have reached a plateau, and it is becoming increasingly difficult to improve retrieval performance further. Thus, one important research question is whether we can provide theoretical justification for the empirical performance bound of basic retrieval functions.
Interactive Search over Academic IR Datasets with Automated Evaluation
Unlike existing command-line-based IR toolkits, RISE/VIRLab provides a more interactive tool that enables easy implementation of retrieval functions with only a few lines of code, a simplified evaluation process over multiple data sets and parameter settings, and a straightforward result analysis interface through operational search engines and pair-wise comparisons.
Skills: LAMP, Docker
Anserini – Enabling Lucene for Academic IR Research
Lucene has a long history of industrial adoption, but it was seldom used by the academic community, largely due to the lack of documentation and code examples. We build Anserini on top of Lucene to enable: (1) Scalable, multi-threaded inverted indexing to handle modern web-scale collections, (2) Streamlined IR evaluation for ad hoc retrieval on standard test collections, and (3) Extensible architecture for multi-stage ranking. Anserini ships with support for many TREC test collections, providing a convenient way to replicate competitive baselines right out of the box.
Skills: Lucene
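For readers unfamiliar with inverted indexing, the sketch below shows the basic data structure in plain Python. It is neither Anserini nor Lucene code: it is single-threaded, keeps everything in memory, and uses a deliberately naive term-frequency score. The function names and toy documents are assumptions for this example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to a sorted list of (doc_id, term_frequency) postings."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return {term: sorted(postings.items()) for term, postings in index.items()}


def search(index, query):
    """Rank documents by summed term frequency over the query terms."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for doc_id, tf in index.get(term, []):
            scores[doc_id] += tf
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical toy collection keyed by document id.
docs = {
    1: "lucene index search",
    2: "search search engine",
    3: "academic research",
}
index = build_index(docs)
results = search(index, "search engine")
```

A production system replaces the naive score with a ranking function such as BM25 and shards index construction across threads, which is the part a toolkit built on Lucene takes care of.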
publications
Taifei Zhao, Xizheng Ke and Peilin Yang. Local-map-based candidate node-encircling pre-configuration cycles construction in survivable mesh networks. In First International Conference on Future Information Networks (JICFIN'2009), 2009. (PDF)
Peilin Yang, Xizheng Ke and Taifei Zhao. Study of Ultraviolet Mobile Ad Hoc Network. In Symposium on Photonics and Optoelectronics (SOPO'2009), 2009. (PDF)
Taifei Zhao, Xizheng Ke and Peilin Yang. Position and Velocity Aided Routing Protocol in Mobile Ad Hoc Networks. International Journal of Digital Content Technology and its Applications (JDCTA'2010), 2010. (PDF)
Peilin Yang and Hui Fang. An exploration of ranking-based strategy for contextual suggestion. In Proceedings of the 21st Text REtrieval Conference (TREC'2012), 2013. (PDF, Slides)
Peilin Yang and Hui Fang. Opinion-based User Profile Modeling for Contextual Suggestions. In Proceedings of the 2013 Conference on the Theory of Information Retrieval (ICTIR'2013). ACM, New York, NY, USA, Pages 18, 4 pages. 2014. (PDF, Slides)
Peilin Yang and Hui Fang. Opinion-based User Profile Modeling for Contextual Suggestions. In Proceedings of the 22nd Text REtrieval Conference (TREC'2013), 2013. (PDF, Slides)
Peilin Yang and Hui Fang. Evaluating the Effectiveness of Axiomatic Approaches in Web Track. In Proceedings of the 22nd Text REtrieval Conference (TREC'2013), 2013. (PDF, Slides)
Xitong Liu, Peilin Yang and Hui Fang. EntEXPO: An Interactive Search System for Entity-Bearing Queries. In Proceedings of the 36th European Conference on IR Research on Advances in Information Retrieval - Volume 8416 (ECIR'2014), Vol. 8416. Springer-Verlag New York, Inc., New York, NY, USA, 784-788. (PDF)
Hui Fang, Hao Wu, Peilin Yang, Chengxiang Zhai. VIRLab: A Web-based Virtual Lab for Learning and Studying Information Retrieval Models. In Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval (SIGIR'2014). ACM, New York, NY, USA, 1249-1250. (PDF, Slides, Demo)
Peilin Yang and Hui Fang. Exploration of Opinion-aware Approach to Contextual Suggestion. In Proceedings of the 23rd Text REtrieval Conference (TREC'2014), 2014. (PDF, Slides)
Xitong Liu, Peilin Yang and Hui Fang. Entity Came to Rescue - Leveraging Entities to Minimize Risks in Web Search. In Proceedings of the 23rd Text REtrieval Conference (TREC'2014), 2014. (PDF)
Peilin Yang and Hui Fang. Combining Opinion Profile Modeling with Complex Context Filtering for Contextual Suggestion. In Proceedings of the 24th Text REtrieval Conference (TREC'2015), 2015. (PDF, Slides)
Peilin Yang, Hongning Wang, Hui Fang, and Deng Cai. Opinions matter: a general approach to user profile modeling for contextual suggestion. Information Retrieval 18, 6 (December 2015), 586-610. (PDF)
Peilin Yang and Hui Fang. A Reproducibility Study of Information Retrieval Models. In Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval (ICTIR'2016). ACM, New York, NY, USA, 77-86. (PDF, Slides, Demo)
Peilin Yang and Hui Fang. Estimating Retrieval Performance Bound for Single Term Queries. In Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval (ICTIR'2016). ACM, New York, NY, USA, 237-240. (PDF, Slides, Demo)
Peilin Yang, Hui Fang and Jimmy Lin. Anserini: Enabling the Use of Lucene for Information Retrieval Research. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'2017). ACM, New York, NY, USA, 1253-1256. (PDF, Demo, Code)
Leif Azzopardi, Matt Crane, Hui Fang, Grant Ingersoll, Jimmy Lin, Yashar Moshfeghi, Harrisen Scells, Peilin Yang, and Guido Zuccon. The Lucene for Information Access and Retrieval Research (LIARR) Workshop at SIGIR 2017. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'2017). ACM, New York, NY, USA, 1429-1430. (PDF)
Peilin Yang and Hui Fang. Can Short Queries Be Even Shorter? In Proceedings of the 2017 ACM on International Conference on the Theory of Information Retrieval (ICTIR'2017). ACM, New York, NY, USA, 43-50. (PDF, Slides)
Yue Wang, Peilin Yang, Hui Fang. Evaluating Axiomatic Retrieval Models in the Core Track. In Proceedings of the 26th Text REtrieval Conference (TREC'2017), 2017. (PDF)
Kuang Lu, Peilin Yang, Hui Fang. Silent Day Detection in Real-Time Summarization Track. In Proceedings of the 26th Text REtrieval Conference (TREC'2017), 2017. (PDF)
Peilin Yang and Hui Fang. Towards Privacy-Preserving Evaluation for Information Retrieval Models Over Industry Data Sets. In Proceedings of the 13th Asia Information Retrieval Societies Conference (AIRS'2017). Springer International Publishing, Jeju Island, South Korea, 210-221. (PDF, Slides)
Peilin Yang, Srikanth Thiagarajan, and Jimmy Lin. Robust, Scalable, Real-Time Event Time Series Aggregation at Twitter. In Proceedings of the 2018 ACM SIGMOD International Conference on Management of Data (SIGMOD'2018), June 2018, Houston, Texas. (PDF, Slides)
Peilin Yang, Hui Fang, and Jimmy Lin. 2018. Anserini: Reproducible Ranking Baselines Using Lucene. J. Data and Information Quality 10, 4, Article 16 (October 2018), 20 pages. (PDF)
Peilin Yang and Jimmy Lin. Anserini at TREC 2018: CENTRE, Common Core, and News Tracks. In Proceedings of the 27th Text REtrieval Conference (TREC'2018), 2018. (PDF)
Peilin Yang and Jimmy Lin. Reproducing and Generalizing Semantic Term Matching in Axiomatic Information Retrieval. In Proceedings of the 41st European Conference on Information Retrieval, Part I (ECIR'2019), pages 369-381, April 2019, Cologne, Germany. (PDF)
Wei Yang, Kuang Lu, Peilin Yang, and Jimmy Lin. Critically Examining the "Neural Hype": Weak Baselines and the Additivity of Effectiveness Gains from Neural Ranking Models. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'2019). July 2019, Paris, France. (PDF)
Jimmy Lin and Peilin Yang. The Impact of Score Ties on Repeatability in Document Ranking. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'2019). July 2019, Paris, France. (PDF)