Resources:
Google
Google Scholar
CiteSeer
DBLP Bibliography
Course Information:
Instructor: Chengkai Li
TA: Yuanzhe Cai
Course Description: We will study papers on Web Search, Mining, and Integration, covering topics in databases, data mining, information retrieval, and the intersections of these areas. The goals of the course are: to expose graduate students to the cutting edge of research in these areas; to equip them with the skills they need to find jobs; to help them identify research topics and produce preliminary work through course projects; and to prepare new students for doing research with faculty in databases, data mining, and information retrieval. Detailed topics include:
Prerequisites: CSE 3330/5330 Database Systems I or CSE 5334 Data Mining, or similar courses, or consent of instructor.
There is no exam. We will focus on paper review, presentation, and project.
Announcements: Stay tuned and make sure to check BlackBoard frequently. Important announcements will be posted there.
Every student must contribute equally to the group project (unless you are doing the project individually). Only one student in a group needs to upload project-related assignments to BlackBoard.
Regrading: A regrading request must be made within 7 days after we post scores in BlackBoard. The TA will handle regrade requests. If you are not satisfied with the regrading result, you have 7 days to request again. The instructor will then regrade, and that decision is final.
BlackBoard:
Log in to BlackBoard with your NetID and password. We use BlackBoard for: (1) Announcements; (2) Assignment Submission; (3) Discussion; (4) Releasing materials, assignments, scores and grades.
Ethics Policies and Academic Integrity: The College cannot and will not tolerate any form of academic dishonesty by its students. This includes, but is not limited to, cheating on examinations, plagiarism, or collusion (explained in the document below). Students are required to read the following document carefully, sign it, return the signed copy to the instructor, and keep a copy for their own records. Hard copies of this document will be provided to students in the first class and can also be picked up in the instructor's office. If you print it yourself, please print it double-sided.
Statement on Ethics, Professionalism, and Conduct for Engineering Students
Miscellaneous: If you require accommodation based on disability, I would like to meet with you in the privacy of my office during the first week of the semester to ensure that you are appropriately accommodated. Please read the page of the office for students with disabilities.
Date | Lecture/Activities | Presenter | Due | Lecture Notes |
01/19 | To be rescheduled | | | |
Introduction | | | | |
01/24 | Course Overview | Chengkai Li | [PDF] | |
01/26 | Entity-Relationship Queries over Wikipedia. Xiaonan Li, Chengkai Li, Cong Yu. In Proceedings of the 2nd International Workshop on Search and Mining User-generated Contents (SMUC 2010), pages 21-28, Toronto, Canada, October 2010. (Co-located with CIKM 2010) | Xiaonan Li | | |
01/31 | Facetedpedia: Dynamic Generation of Query-Dependent Faceted Interfaces for Wikipedia. Chengkai Li, Ning Yan, Senjuti Basu Roy, Lekhendro Lisham, Gautam Das. To appear in Proceedings of the 19th International World Wide Web Conference (WWW 2010), Raleigh, North Carolina, April 2010. | Ning Yan | | |
02/02 | Course Project Topics | Chengkai Li | | |
02/07 | Paper Review, Presentation, Research Resources | Chengkai Li | | |
02/09 | Paper Review, Presentation, Research Resources (cont'd) | Chengkai Li | | |
Semantic Web | | | | |
02/14 | SemTag | | | |
02/16 | YAGO. Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum: YAGO: A Large Ontology from Wikipedia and WordNet. J. Web Sem. 6(3): 203-217 (2008) | | | |
02/21 | Simone Paolo Ponzetto, Michael Strube: Deriving a Large-Scale Taxonomy from Wikipedia. AAAI 2007: 1440-1445; Cäcilia Zirn, Vivi Nastase, Michael Strube: Distinguishing between Instances and Classes in the Wikipedia Taxonomy. ESWC 2008: 376-387 | Proposal | | |
Entity Recognition and Disambiguation | | | | |
02/23 | D. Milne and I. H. Witten. Learning to link with Wikipedia. In CIKM ’08, pages 509–518, 2008; R. Mihalcea and A. Csomai. Wikify!: linking documents to encyclopedic knowledge. In CIKM ’07, pages 233–242, 2007. | | | |
02/28 | S. Kulkarni, A. Singh, G. Ramakrishnan, and S. Chakrabarti. Collective annotation of Wikipedia entities in Web text. In KDD ’09, pages 457–466, 2009; X. Han and J. Zhao. Named entity disambiguation by leveraging Wikipedia semantic knowledge. In CIKM ’09, pages 215–224. QUIZ | | | |
Information Extraction | | | | |
03/02 | Machine Learning Approach, Wrapper | | | |
03/07 | KnowItAll | | | |
03/09 | TextRunner and Open Information Extraction | | | |
03/14 | Spring break | | | |
03/16 | | | | |
Structured Querying Over the Web, Entity Search and Ranking | | | | |
03/21 | ExDB. Michael J. Cafarella, Christopher Re, Dan Suciu, Oren Etzioni: Structured Querying of Web Text Data: A Technical Challenge. CIDR 2007: 225-234 | | | |
03/23 | EntityRank. Tao Cheng, Xifeng Yan, Kevin Chen-Chuan Chang: EntityRank: Searching Entities Directly and Holistically. VLDB 2007: 387-398 | | | |
03/28 | Soumen Chakrabarti, Kriti Puniyani, Sujatha Das: Optimizing scoring functions and indexes for proximity search in type-annotated corpora. WWW 2006: 717-726 | Progress Report | | |
03/30 | SQoUT. Panagiotis G. Ipeirotis, Eugene Agichtein, Pranay Jain, Luis Gravano: To search or to crawl?: towards a query optimizer for text-centric tasks. SIGMOD Conference 2006: 265-276. QUIZ | | | |
04/01 | Last day to drop class | | | |
Guest Lectures | | | | |
04/04 | | | | |
04/06 | | | | |
Web Data Mining (cont'd) | | | | |
04/11 | Clustering Web Search Results | | | |
04/13 | GABRILOVICH, G. AND S. MARKOVITCH. Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis. IJCAI’07, p.1606–1611. | | | |
04/18 | MILNE, D. AND WITTEN, I.H. An effective, low-cost measure of semantic relatedness obtained from Wikipedia links. WIKIAI'08 | | | |
Social Networks | | | | |
04/20 | Tagging | | | |
04/25 | Shenghua Bao, Gui-Rong Xue, Xiaoyuan Wu, Yong Yu, Ben Fei, Zhong Su: Optimizing web search using social annotations. WWW 2007: 501-510 | | | |
04/27 | QUIZ | | | |
05/02 | Final Report, Presentation and Demo Slides, source code | | | |
05/04 | Project presentation and Demo | | | |
05/09 | 5:30-8pm, Project presentation and Demo | | | |
University calendar: Spring 2011