Hao Che

Name

[Che, Hao]
  • Associate professor

Biography

Hao Che received the B.S. degree from Nanjing University, Nanjing, China, in 1984, the M.S. degree in physics from the University of Texas at Arlington, TX, in 1994, and the Ph.D. degree in electrical engineering from the University of Texas at Austin, TX, in 1998. He was an Assistant Professor of Electrical Engineering at the Pennsylvania State University, University Park, PA, from 1998 to 2000, and a System Architect with Santera Systems, Inc., Plano, TX, from 2000 to 2002. Since September 2002, he has been with the Department of Computer Science and Engineering at the University of Texas at Arlington, TX. His current research interests include network architecture and network resource management, and performance analysis of large-scale distributed computing systems, including many-core processors, warehouse-scale computing, and cloud computing.

Professional Preparation

    • 1994 M.S. in Theoretical Physics
    • 1998 Ph.D. in Electrical Engineering, University of Texas at Austin

Appointments

    • Jan 2000 to Jan 2002 Adjunct assistant professor
      Pennsylvania State University
    • Jan 2000 to Jan 2001 System Architect/Embedded System Design Engineer
      Santera Systems, Inc.
    • Jan 1998 to Jan 2000 Assistant Professor (Tenure Track)
      Penn State
    • Jan 1989 to Jan 1991 Research Scientist
      Southwestern Institute of Physics, China
    • Jan 1984 to Jan 1989 Lecturer
      Yunnan University, Kunming, China

Memberships

  • Senior Member
    • Mar 2003 to Present IEEE

Research and Expertise

  • Distributed Traffic Control:
    This NSF-funded project (September 2002 to August 2005) was carried out in collaboration with Dr. Lagoa, who works in the control area at Penn State University. It accomplished its goal of developing a solid mathematical foundation for optimization-based traffic control [3,6,9,10]; as its last milestone, a simulation tool that implements a distributed traffic control architecture in an MPLS domain is close to completion. The results obtained in this research have advanced the state of the art in distributed traffic control. In particular, the family of distributed control laws found in [3] allows fully distributed, class-of-service (CoS) based load balancing and rate adaptation, end-to-end or edge-to-edge. This family of control laws requires only single-bit binary congestion feedback from a forwarding path and hence, mathematically, reaches the theoretical limit in minimizing the information feedback from the network. Meanwhile, it is flexible enough to allow more information feedback from the network for performance enhancement. This family of distributed control laws therefore provides the much-needed theoretical underpinning for the development of traffic control protocols at the IP layer, the transport layer, and the overlay. For example, with these control laws, one can now develop transport layer protocols, with or without the involvement of the IP core, that enable multiple CoSs with various fairness design criteria and proven global stability.
  • Fast Network Processor Analysis Methodology:
    As network processors (NPs) are increasingly used in router interface cards for fast data path processing, a key challenge has emerged: how to map protocol functions to network processors so as to enable rich data path functions while maintaining wire-speed forwarding performance. The traditional cycle-accurate simulation approach is not viable for making such a mapping, especially in the initial design phase when microcode is yet to be developed. This ongoing project aims to develop a novel, fast NP analysis methodology and tool to meet this challenge. Since very little research has been done in this area, a sound theoretical foundation is being developed, and promising preliminary results are reported in [29][30]. In [29], some fundamental conditions for a network processor to achieve wire-speed forwarding performance are rigorously established, and useful tight memory access latency bounds are derived. In [30], an algorithm is developed that finds network processor performance bounds within a fraction of a second. The performance bounds are within 17% of the cycle-accurate simulation results.
  • TCAM Coprocessor Design and Management:
    Despite great efforts, packet classification remains a key potential bottleneck in the critical data path. No single algorithm or coprocessor exists that can meet various packet classification needs simultaneously while matching multi-gigabit line rates, and it is technologically and economically unviable to implement individual packet classification solutions for each need. This substantially limits the ability of a router to support rich, fast data path functions while keeping up with multi-gigabit line rates. The TCAM-based approach holds the promise of fully addressing this issue. In [1] [31], we demonstrated how a distributed TCAM coprocessor architecture for five-tuple firewall filtering can achieve four times higher throughput with only a 50% increase in TCAM memory compared with a single-TCAM solution. Encouraged by the success of this work, we are now designing a distributed TCAM coprocessor architecture for multi-task packet classification that matches the OC-768 line rate. In parallel with the design of new TCAM architectures, we have also successfully addressed three key TCAM management challenges [4, 7, 34]. In [4], we developed a comprehensive solution that allows efficient use of TCAM storage space for rules with ranges. Unlike existing solutions that require additional hardware support, our solution can be readily implemented with only a minimal software upgrade. In [7], we showed how a TCAM database can be updated in parallel with the database lookup process without impacting it. This solution completely eliminates the need to lock the TCAM while the database is being updated, making the TCAM-based solution significantly more attractive than other solutions. Finally, in [34], we identified and addressed the weight deletion problem for a weight-based TCAM, freeing this technology from weight space overflow.
  • Cross-Layer Optimization in Wireless Networks:
    A key issue in supporting QoS over wireless networks is estimating the wireless channel capacity. In [2], we propose a new model to predict the available channel capacity during any given time interval with the required degree of confidence. By leveraging large deviations techniques, we relate the fading channel capacity to the theory of effective bandwidth and establish a connection between the theory of effective bandwidth and information theory. Consequently, our results provide a foundation for the performance analysis of upper layer protocols for QoS provisioning over wireless networks. In [28], we propose a new model to characterize the accumulative conditional fade durations for multiple users. We derive analytical expressions and algorithms for computing effective fade duration envelopes for wireless users. We also obtain a probabilistic lower bound on the capacity of a wireless multi-user system and a probabilistic upper bound on the delay experienced by traffic arrivals. These results provide a powerful means to exploit the multi-user diversity gain in the design of upper layer protocols for wireless communication systems with delay-sensitive applications.
  • Caching in Wired and Wireless Networks:
    In the context of web caching, in [5] [12], we study, from a fundamental point of view, cache replacement algorithms in a hierarchical caching system. In [12], a hierarchical caching system is viewed as a concatenated low-pass filtering system. This view leads to the development of some fundamental design principles for hierarchical caching. [5] improves on [12] by examining a larger set of cache replacement algorithms, providing significant insight into how to effectively manage such a system. In [8] [36], an efficient cache consistency maintenance algorithm for a wireless network is proposed. This algorithm combines the good features of both stateful and stateless cache consistency maintenance approaches and leads to better overall performance than the traditional stateful and stateless algorithms.
  • Cloud Computing

    Methodologies and theoretical foundation for both off-line and on-line resource allocation in warehouse-scale computing and cloud computing

  • Tail-Latency SLO-aware Datacenter Resource Provisioning

    This is an NSF funded project (2017-2019) in collaboration with Drs. Jiang and Lei. 

    Today's datacenters support a wide range of applications with diverse service level objectives (SLOs), from background applications with loose latency requirements to interactive user-facing ones that call for stringent tail latency and throughput guarantees. Failure to meet SLOs by even a small margin for user-facing workloads has been shown to cause significant customer churn and hence revenue loss. Due to the lack of a good understanding of how to translate SLOs into precise resource requirements, which is much needed for effective SLO-aware job scheduling, a common practice is to rely on resource overprovisioning and resource isolation to meet tight SLOs. However, with datacenter size, power, and server hardware approaching their physical limits, service providers are now under tremendous pressure to consolidate their services to improve resource utilization. Meanwhile, it is tempting from both the customers' and the service providers' points of view to enable customized services that meet per-job SLOs. As a result, it is of paramount importance to bridge the gap between SLOs and the underlying system resource requirements.

    As a first yet crucial step towards a comprehensive solution, this project aims to develop a sound theoretical foundation to tackle the above challenge. It explores fundamental design principles and is cross-layer by design, involving two layers that span applications, the runtime system, and the system architecture. At the upper, application layer, with any given job workflow represented by a Directed Acyclic Graph (DAG), the job SLOs are translated into precise latency budgets for individual task nodes in the DAG, independent of the underlying system used to run the job. At the lower, runtime system layer, the subsystems for individual task nodes are selected and resources are allocated to meet all the task performance budgets and hence the job SLOs. The proposed research will enable us to develop per-job schedulers with multi-SLO guarantees, while allowing for service consolidation and achieving high datacenter resource utilization.
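
The translation of a job-level SLO into per-task latency budgets can be illustrated with a minimal sketch. It is not the project's method: it assumes that latency budgets simply add up along a path (which does not hold for tail-latency percentiles in general) and that each task has a known relative cost, and it scales those costs so that the critical path of the DAG exactly meets the end-to-end target. The function name `split_latency_budget`, the task names, and the cost values are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def split_latency_budget(deps, cost, slo_ms):
    """Split an end-to-end latency target across the task nodes of a DAG.

    deps:   maps each task to the set of tasks it depends on (predecessors)
    cost:   maps each task to a relative cost estimate (hypothetical input)
    slo_ms: end-to-end latency target for the whole job, in milliseconds
    """
    # Visit tasks so that every predecessor comes before its successors.
    order = list(TopologicalSorter(deps).static_order())

    # longest[t] = largest total cost of any path that ends at task t.
    longest = {}
    for task in order:
        preds = deps.get(task, ())
        longest[task] = cost[task] + max((longest[p] for p in preds), default=0.0)

    critical = max(longest.values())  # total cost of the critical path
    if critical <= 0:
        raise ValueError("task costs must give a positive critical path")

    # Scale every cost so the critical path consumes exactly slo_ms.
    scale = slo_ms / critical
    return {task: cost[task] * scale for task in order}


# Hypothetical four-task job: parse -> {lookup, rank} -> render.
deps = {"parse": set(), "lookup": {"parse"}, "rank": {"parse"},
        "render": {"lookup", "rank"}}
cost = {"parse": 1, "lookup": 3, "rank": 2, "render": 1}
budgets = split_latency_budget(deps, cost, slo_ms=100.0)
print(budgets["lookup"])  # 60.0
```

Under these assumptions, the budgets along any path sum to at most `slo_ms`, with equality on the critical path; tasks off the critical path (such as `rank` here) are left with slack that a scheduler could reclaim.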

Publications

      Journal Article 2016
      • W. Su, C. Lagoa, and H. Che, "Optimization-based, QoS-aware Distributed Traffic Control Laws for Networks with Time-varying Link Capacities," Automatica, Vol. 72, pp. 158-165, Oct. 2016.

        {Journal Article }

      Journal Article 2015
      • W. Su, C. Liu, C. Lagoa, H. Che, K. Xu, and Y. Cui, “A Family of Optimal, Distributed Traffic Control Laws in a Multidomain Environment,” IEEE Transactions on Control Systems Technology, Vol. 23, No.4, pp. 1373-1386, 2015.

        {Journal Article }

      Journal Article 2014
      • M. Ju, H. Jung, and H. Che, “A New Methodology for Design Space Exploration of Multithreaded Multicore Processors,”  IEEE Transactions on Computers,  Vol. 63, No. 2, pp. 276-289, Feb. 2014.  

        {Journal Article }

      Conference Proceeding 2010
      • H. Che, “A Unified Traffic Control Architecture for Future Internet,” white paper accepted in 2010; the author was invited to participate in a group discussion on the direction of the future Internet at FutureHetNets2011, organized by NSF, NASA, and MIT (http://www.rle.mit.edu/futurehetnets/Agenda.htm).
        {Conference Proceeding }

      Conference Paper 2009
      • M. Ju, H. Che, Z. Wang, “Performance Analysis of Caching Effect on Packet Processing in a Multi-Threaded Processor,” the International Conference on Communications and Mobile Computing (CMC 2009), Jan. 2009.
        {Conference Paper }

      Journal Article 2009
      •  L. Ye, Z. Wang, H. Che, H. Chen and C. Lagoa, “Utility Function of TCP,”  Computer Communications, Vol. 32, No. 5, pp. 800-805, 2009
        {Journal Article }
      2009
      • Z. Wang, H. Che, J. Cao, and J. Wang, “A TCAM-based Solution for Integrated Traffic Anomaly Detection and Policy Filtering,” Computer Communications, Vol. 32, No. 17, pp. 1893-1901, 2009.  
        {Journal Article }

      Conference Paper 2008
      • H. Jung, M. Ju, H. Che, and Z. Wang, “A Fast Performance Analysis Tool for Multicore, Multithreaded Communication Processors,” 11th IEEE High Assurance Systems Engineering Symposium (HASE), Dec.  2008.
        {Conference Paper }
      2008
      • X. Sun, Z. Wang, H. Che and F. Zhao, “An End User Enabled MAC-in-MAC Encapsulation Scheme for Metro-Ethernet,” International Symposium on Advances in Parallel and Distributed Computing Technique, December, 2008
        {Conference Paper }
      2008
      • L. Ye, Z. Wang, H. Che, H. Chan, and C. Lagoa, “Utility Function of TCP and Its Application to the Design of Minimum Rate Guaranteed Control Law,” IEEE ICC 2008, May 2008.
        {Conference Paper }
      2008
      • L. Ye, Z. Wang, and H. Che, “A Family of QoS Aware Congestion Control Protocols,” IEEE ICC, May, 2008
        {Conference Paper }

      Journal Article 2008
      • H. Che, Z. Wang, K. Zheng, and B. Liu, “DRES: Dynamic Range Encoding Scheme for TCAM Coprocessors,” IEEE Transactions on Computers, Vol. 57, No. 7, pp. 902-915, July 2008.
        {Journal Article }

      Conference Paper 2007
      • I. Ghosh, H. Che, C. Lagoa, M. Kumar, and S. Das, “SoS: A Service Oriented Scalable Traffic Control Architecture for Future Internet,” IEEE ICC Workshop on Traffic Engineering in Next Generation IP Networks, Jun. 2007.
        {Conference Paper }
      2007
      • W. Su, C. Lagoa, and H. Che, “A Family of Optimization-Based Traffic Control Laws for Overlay Networks,” IEEE CDC 2007, Dec. 2007
        {Conference Paper }

      Journal Article 2007
      • C. Z. Li, H. Che, and S. Q. Li, “A Wireless Channel Capacity Model for Quality of Service,” IEEE Transactions on Wireless Communications, Vol. 6, No. 1, pp. 356-366, Jan. 2007.
        {Journal Article }
      2007
      • B. Movsichoff, C. Lagoa, and H. Che, “Minimal Feedback Optimal Algorithms for Traffic Engineering in Computer Networks,” IEEE/ACM Transactions on Networking, Vol. 15, No. 4, pp. 813-823, August 2007.
        {Journal Article }
      2007
      • C. Li, H. Che, and S. Q. Li, “Fade Statistics of Wireless Multi-user Systems,” IEEE Transactions on Wireless Communications, Vol. 6, No. 7, pp. 2390-2395, July 2007.
        {Journal Article }

      Conference Paper 2006
      • Z. Wang, U. Mani, M. Ju, and H. Che, “A Rate Adaptive Hybrid MAC Protocol for Wireless Cellular Networks,” ICWMC 2006.
        {Conference Paper }
      2006
      • C. Z. Li, H. Che, S. Q. Li, and D. P. Wu, “A new wireless channel fade duration model for exploiting multi-user diversity gain and its Applications,” the IEEE International Symposium on A World of Wireless, Mobile and Multimedia Networks (WoWMoM2006), Jun. 26, 2006
        {Conference Paper }
      2006
      • Y. Cui, H. Che, C. Lagoa, and Z. Zheng, “A Least Interference Path Algorithm for MPLS Traffic Engineering,” The 3rd International Conference on Autonomic and Trusted Computing (ATC-06), Sept. 3-6, 2006.
        {Conference Paper }
      2006
      • Z. Wang, U. Mani, H. Che, and Miao Ju, “A Rate Adaptive Hybrid MAC Protocol for Wireless Cellular Networks,” International Conference on Wireless and Mobile Communications (ICWMC 2006), July 29-31, 2006.
        {Conference Paper }
      2006
      • Y. Cui, H. Che, and Z. Zheng, “A Least Interference Path Algorithm for MPLS Traffic Engineering,” the 3rd IFIP International Conference on Autonomic and Trusted Computing (ATC-06), Sept. 3, 2006
        {Conference Paper }
      2006
      • H. Che, W. Su, C. Lagoa, K. Xu, C. Liu, and Y. Cui, “An Integrated, Distributed Traffic Control Strategy for the Future Internet,” The ACM SIGCOMM INM Workshop 2006.
        {Conference Paper }
      2006
      • H. Che, C. Kumar, and B. Menasinahal, “A Fast Latency Bound Estimation Algorithm for a Multithreaded Network Processor,” the 18th IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS), Nov. 2006.
        {Conference Paper }
      2006
      • H. Che, M. Gupta, S. Velayutham, C. Lagoa, and Z. Wang, “INTESER: A Integrated Solution to Provide QoS, Traffic Engineering, and Fault Tolerance in an MPLS Network,” the 18th IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS), Nov. 2006.
        {Conference Paper }
      2006
      • Z. Liu, H. Che, K. Zheng, S. Chen, C. Hu, and B. Liu, “Trace Driven Comparison of Latency Hiding Techniques for Network Processors,” accepted by IEEE International Conference on Communications (ICC2006), Jun. 11, 2006.
        {Conference Paper }

      Journal Article 2006
      • N. Laoutaris, H. Che, and I. Stavrakakis, “The LCD Interconnection of LRU Caches and Its Analysis,” Performance Evaluation Journal, Vol. 63, No. 7, pp. 609-634, 2006.
        {Journal Article }
      2006
      • K. Zheng, H. Che, Z. Wang, and B. Liu, “TCAM-based Distributed Parallel Packet Classification Algorithm with Range-Matching Solution,” IEEE Transactions on Computers, Vol. 55, No. 8, August 2006.
        {Journal Article }

      Conference Paper 2005
      • K. Zheng, H. Che, Z. Wang, and B. Liu, "TCAM-based Distributed Parallel Packet Classification Algorithm with Range-Matching Solution," Proceedings of IEEE INFOCOM 2005, March 2005.
        {Conference Paper }
      2005
      • B. Movsichoff, C. Lagoa, and H. Che, "Decentralized Optimal Traffic Engineering in Connectionless Networks," IEEE Journal on Selected Areas in Communications, February 2005.
        {Conference Paper }
      2005
      • H. Che, C. Kumar, and B. Menasinahal, "Fundamental Network Processor Performance Bounds," accepted by the 4th IEEE International Symposium on Network Computing and Applications (IEEE NCA05), 2005.
        {Conference Paper }

      Journal Article 2005
      • N. Laoutaris, H. Che, and I. Stavrakakis, "The LCD Interconnection of LRU Caches and Its Analysis," accepted for publication by Performance Evaluation Journal, 2005.
        {Journal Article }

      Journal Article 2004
      • Zhijun Wang, Hao Che, Mohan Kumar, Sajal K. Das: CoPTUA: Consistent Policy Table Update Algorithm for TCAM without Locking. IEEE Trans. Computers 53(12): 1602-1614 (2004)
        {Journal Article }
      2004
      • Zhijun Wang, Sajal K. Das, Hao Che, Mohan Kumar: A Scalable Asynchronous Cache Consistency Scheme (SACCS) for Mobile Environments. IEEE Trans. Parallel Distrib. Syst. 15(11): 983-995 (2004)
        {Journal Article }
      2004
      • Constantino M. Lagoa, Hao Che, Bernardo A. Movsichoff: Adaptive control algorithms for decentralized optimal traffic engineering in the internet. IEEE/ACM Trans. Netw. 12(3): 415-428 (2004)
        {Journal Article }

      Journal Article 2003
      • Zhijun Wang, Sajal K. Das, Hao Che, Mohan Kumar: SACCS: Scalable Asynchronous Cache Consistency Scheme for Mobile Environments. ICDCS Workshops 2003: 797-802
        {Journal Article }

      Journal Article 2002
      • Xiaoning He, Hao Che: TCP performance analysis and optimization over DMT based ADSL system. Computer Communications 25(3): 322-328 (2002)
        {Journal Article }
      2002
      • Ye Tung, Hao Che: A flow caching mechanism for fast packet forwarding. Computer Communications 25(14): 1257-1262 (2002)
        {Journal Article }

      Journal Article 2001
      • Hao Che, Zhijun Wang, Ye Tung: Analysis and Design of Hierarchical Web Caching Systems. INFOCOM 2001: 1416-1424
        {Journal Article }

      Journal Article 2000
      • Xinwei Hong, Zailu Huang, Hao Che: A Scheduling Method for Bounded Delay Services in High Speed Networks. ICC (2) 2000: 863-867
        {Journal Article }

      Journal Article 1999
      • Hao Che, San-qi Li: MPOA Flow Classification Design and Analysis. INFOCOM 1999: 1497-1504
        {Journal Article }

      Journal Article 1998
      • Hao Che, San-qi Li, Arthur Y. M. Lin: Adaptive Resource Management for Flow-Based IP/ATM Hybrid Switching Systems. INFOCOM 1998: 381-389
        {Journal Article }
      1998
      • Hao Che, San-qi Li, Arthur Y. M. Lin: Adaptive resource management for flow-based IP/ATM hybrid switching systems. IEEE/ACM Trans. Netw. 6(5): 544-557 (1998)
        {Journal Article }

      Journal Article 1997
      • Hao Che, San-qi Li: Fast Algorithms for Measurement-Based Traffic Modeling. INFOCOM 1997: 177-186
        {Journal Article }

      Journal Article Published
      • H. Che, and M. Nguyen, “Amdahl’s Law for Multithreaded Multicore Processors,” J. of Parallel and Distributed Computing, Vol. 74, No. 10, pp. 3056-69, Oct., 2014. 

        {Journal Article }

Courses

      • CSE 6349-002 Special Topics on Advanced Networks

        This is a 6000-level course designed for students in the networking track. It is composed of two distinct parts. The first part is a series of regular lectures covering router data plane functions and their programming in router interface cards using multithreaded multicore processors. It is a sequel to CSE5344, exposing students to the implementation aspects of the Internet protocols. The second part is a series of student team presentations on advanced networking research topics, involving various advanced networking techniques, e.g., software defined networking, cloud/datacenter network resource provisioning and virtualization, and overlay networking techniques. Throughout the semester, the student teams will be assigned ten research papers to read, each covering certain aspects of state-of-the-art networking solutions. Each team will then be randomly selected to present one of these ten papers. The aim is to expose students to advanced networking technologies and stimulate research interest in addressing fundamental challenges facing the design of the future Internet.

        Spring - Regular Academic Session - 2017
      • CSE 5344-001 COMPUTER NETWORKS

        This course aims at introducing the students to modern computer networks, in particular the Internet. We will discuss basic network architecture, design principles, different protocols, and applications. We will study the application, transport, networking, and link layers. Students are expected to perform two projects, including network programming, to obtain hands-on knowledge.

        Fall - Regular Academic Session - 2016
      • CSE 6349-002 Special Topics on Advanced Networks

        This is a 6000-level course designed for students in the networking track. Accordingly, it is composed of two distinct parts. The first part, offered in the first half of the semester, consists of regular lectures covering router data plane processing and programming using multithreaded multicore processors to achieve high speed forwarding performance. This part is a sequel to CSE5344, exposing students to implementation aspects of the Internet protocols. The second part is mainly composed of a series of advanced networking-related research topics, such as software defined networking, Internet of things, and content distribution networks, presented by student teams as part of a project assignment. The aim is to expose students to advanced networking technologies and stimulate research interest in addressing fundamental challenges facing the design of the future Internet.

        Fall - Regular Academic Session - 2016
      • CSE 5344-002 COMPUTER NETWORKS

        This course aims at introducing the students to modern computer networks, in particular the Internet. We will discuss basic network architecture, design principles, different protocols, and applications. We will study the application, transport, networking, and link layers. Students are expected to perform two projects, including network programming, to obtain hands-on knowledge.

        Spring - Regular Academic Session - 2016
      • CSE 6350-001 ADVANCED TOPICS IN COMPUTER ARCHITECTURE

        This is a 6000-level course designed for students in both networking and system tracks. Accordingly, it covers two related subjects. The first subject addresses a major Internet router implementation challenge, i.e., how to program router interface cards using the state-of-the-art multithreaded multicore processors or chip multiprocessors (CMPs) to achieve high speed forwarding performance. This subject is a sequel to CSE5344 and it also serves as a motivating case study for the second subject. The second subject provides in-depth coverage of the emerging CMPs and warehouse-scale computers (WSCs) and is a sequel to CSE5350. The aim is to not only cover known facts but also stimulate research interests in addressing fundamental challenges facing the design and programming of CMPs and WSCs. As the number of cores in a CMP/WSC ever increases, how to design and program CMPs/WSCs to achieve desired performance for various workloads becomes a challenge. Clearly, the traditional uniprocessor analysis approaches, such as benchmark testing and cycle-accurate simulation, quickly become ineffective as the number of cores in a CMP/WSC increases. To tackle this challenge, this course will introduce a novel thread-level analysis methodology for large design space exploration of CMP/WSC. In particular, based on this methodology, initial results on the development of performance bound analysis, bottleneck resource identification, simulation, and analytical modeling techniques, all at the thread level, will be introduced. These techniques will be further explored and applied to the analysis of various aspects of CMP/WSC architectures by the students in a term project.    

        Spring - Regular Academic Session - 2016
      • CSE 5344-001 COMPUTER NETWORKS

        This course aims at introducing the students to modern computer networks, in particular the Internet. We will discuss basic network architecture, design principles, different protocols, and applications. We will study the application, transport, networking, and link layers. Students are expected to perform two projects, including network programming, to obtain hands-on knowledge.

        Fall - Regular Academic Session - 2015
      • CSE 6350-001 ADVANCED TOPICS IN COMPUTER ARCHITECTURE

        This is a 6000-level course designed for students in both networking and system tracks. Accordingly, it covers two related subjects. The first subject addresses a major Internet router implementation challenge, i.e., how to program router interface cards using the state-of-the-art multithreaded multicore processors or chip multiprocessors (CMPs) to achieve high speed forwarding performance. This subject is a sequel to CSE5344 and it also serves as a motivating case study for the second subject. The second subject provides in-depth coverage of the emerging CMPs and warehouse-scale computers (WSCs) and is a sequel to CSE5350. The aim is to not only cover known facts but also stimulate research interests in addressing fundamental challenges facing the design and programming of CMPs and WSCs. As the number of cores in a CMP/WSC ever increases, how to design and program CMPs/WSCs to achieve desired performance for various workloads becomes a challenge. Clearly, the traditional uniprocessor analysis approaches, such as benchmark testing and cycle-accurate simulation, quickly become ineffective as the number of cores in a CMP/WSC increases. To tackle this challenge, this course will introduce a novel thread-level analysis methodology for large design space exploration of CMP/WSC. In particular, based on this methodology, initial results on the development of performance bound analysis, bottleneck resource identification, simulation, and analytical modeling techniques, all at the thread level, will be introduced. These techniques will be further explored and applied to the analysis of various aspects of CMP/WSC architectures by the students in a term project.    

        Fall - Regular Academic Session - 2015
      • CSE 5344-001 Computer Networks

        This course aims at introducing the students to modern computer networks, in particular the Internet. We will discuss basic network architecture, design principles, different protocols, and applications. We will study the application, transport, networking, and link layers. Students are expected to perform two projects, including network programming, to obtain hands-on knowledge.

        Fall - Regular Academic Session - 2014
      • CSE 6350-001 Advanced Computer Architecture

        This is a 6000-level course designed for students in both networking and system tracks. Accordingly, it covers two related subjects. The first subject addresses a major Internet router implementation challenge, i.e., how to program router interface cards using the state-of-the-art multithreaded multicore processors or chip multiprocessors (CMPs) to achieve high speed forwarding performance. This subject is a sequel to CSE5344 and it also serves as a motivating case study for the second subject. The second subject provides in-depth coverage of the emerging CMPs and warehouse-scale computers (WSCs) and is a sequel to CSE5350. The aim is to not only cover known facts but also stimulate research interests in addressing fundamental challenges facing the design and programming of CMPs and WSCs. As the number of cores in a CMP/WSC ever increases, how to design and program CMPs/WSCs to achieve desired performance for various workloads becomes a challenge. Clearly, the traditional uniprocessor analysis approaches, such as benchmark testing and cycle-accurate simulation, quickly become ineffective as the number of cores in a CMP/WSC increases. To tackle this challenge, this course will introduce a novel thread-level analysis methodology for large design space exploration of CMP/WSC. In particular, based on this methodology, initial results on the development of performance bound analysis, bottleneck resource identification, simulation, and analytical modeling techniques, all at the thread level, will be introduced. These techniques will be further explored and applied to the analysis of various aspects of CMP/WSC architectures by the students in a term project.    

        Fall - Regular Academic Session - 2014
      • CSE 6350-001 ADVANCED TOPICS IN COMPUTER ARCHITECTURE
        No Description Provided.
        Spring - Regular Academic Session - 2013
      • CSE 4344-001 COMPUTER NETWORK ORGANIZATION
        No Description Provided.
        Spring - Regular Academic Session - 2013