High-performance computing, smart grid, security, assistive technologies, energy-aware computing, and multimedia, which are elaborated below:
(a) High-Performance Computing Systems
My approach is to design fast and scalable game-theory-based dynamic coarse- and fine-grained data allocation and replication algorithms [J59], [J65], [J74] such that the system is virtually self-tuning and self-repairing. The objective is to ensure that not only are performance measures such as response time and communication cost minimized, but that a high degree of dependability is also maintained despite the failure of some components. I have developed real-time auction and bidding techniques that explore the possibility of hybrid cooperative and non-cooperative games. I have also proposed replica allocation techniques based on game theory's nascent sub-field of algorithmic mechanism design [J65], [J68], [C105], [C107], [C110], [C115], [C123]. Using a similar approach, I am designing scheduling and load-balancing algorithms for grid computing [C110].
I have been investigating various resource allocation techniques for Grid computing environments. Numerous solutions for resource allocation have been proposed, spanning from traditional optimization techniques such as linear programming to computational economic models such as auctions. However, in many approaches the notion of distributed administration of resources is rather blurred. In particular, when a task arrives at the portal of a Grid, the decision of where to send the task for fastest execution is made at the portal itself; this does not constitute truly distributed administration of resources. Thus, I seek techniques that result in efficient and effective resource allocation in Grid environments where resources are administered by both centralized and decentralized authorities. In my approach, resource management in grids is expressed through agent-based resource bidding models, using both cooperative and non-cooperative schemes [C122]. The competitive nature of users and resource owners is expressed via computational agents.
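To make the bidding model concrete, the following is a minimal Python sketch of a sealed-bid second-price (Vickrey) auction that assigns one task to competing resource agents. The agent names, the cost model, and the truthful-bidding assumption are illustrative only and are not drawn from [C122]:

    from dataclasses import dataclass

    @dataclass
    class ResourceAgent:
        """A resource owner that bids its estimated cost to run a task."""
        name: str
        speed: float          # processing rate (work units per second)
        price_per_sec: float  # cost of one second of this resource

        def bid(self, task_size: float) -> float:
            # In a Vickrey auction, bidding the true cost is a
            # dominant strategy, so agents simply report their cost.
            return (task_size / self.speed) * self.price_per_sec

    def vickrey_auction(agents, task_size):
        """Assign the task to the lowest bidder; charge the second-lowest bid."""
        bids = sorted((a.bid(task_size), i) for i, a in enumerate(agents))
        (_, winner_idx), (second_price, _) = bids[0], bids[1]
        return agents[winner_idx], second_price

    agents = [ResourceAgent("grid-node-A", 4.0, 0.10),
              ResourceAgent("grid-node-B", 2.5, 0.05),
              ResourceAgent("grid-node-C", 8.0, 0.20)]
    winner, payment = vickrey_auction(agents, task_size=100.0)
    print(f"task -> {winner.name}, payment = {payment:.2f}")

The second-price rule is what makes truthful bidding rational for self-interested resource owners, which is the core appeal of mechanism-design-based allocation.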
(b) Video Compression
In the realm of video compression, my research has been in four sub-areas: (a) bit-rate control in video compression, to regulate the bit stream while maximizing video quality [J44], [J50], [J51], [J56]; (b) content- and power-aware (e.g., battery-aware) video compression and decompression software tools built around standards such as H.263, MPEG-4, and H.264 (the new standard developed by ISO and ITU); here I have designed motion estimation and rate-controlled bit allocation algorithms whose complexity can be dynamically adjusted according to the content and the available power of the underlying device [J49]; (c) transcoding algorithms for inter-operation and conversion of one video format to another, allowing users to exchange multiple video formats seamlessly in pervasive environments [J50]; (d) delivery of scalable multimedia content in pervasively deployed networks comprising diverse bandwidths and wearable devices [J31], [J43], [C88].
The exchange of video information between remote sites requires that the digital video be encoded and transmitted through specified network connections. Due to wide fluctuations in content, the amount of compressed video data varies in a rather unpredictable manner. A problem arises when the compressed video data rate is inconsistent with the channel bandwidth. The data rate may be lower than the bandwidth, which leads to unnecessary degradation of visual quality and wasted bandwidth, or higher than the bandwidth, which results in traffic congestion and possible loss of data. Rate control is required to regulate the compressed video bit rate to match the channel capacity. My contributions in this regard include game-theory-based rate control schemes for optimizing bit allocation in video compression. I was the first researcher to use game theory in video compression. One of my algorithms utilizes a cooperative bargaining game to optimize perceptual quality while guaranteeing “fairness” in bit allocation among the macroblocks of a frame [J51]. The macroblocks of a frame play cooperative games in which each block of an image competes for its share of resources (bits) to optimize its visual quality. Since the whole frame is an entity perceived by viewers, macroblocks compete cooperatively under the global objective of achieving the best quality for the entire frame under a given bit constraint. The major advantage of the proposed approach is that the cooperative game converges to a Nash Bargaining Solution, which leads to an optimal and fair bit allocation. Another advantage is that it allows multi-objective optimization with multiple decision makers (e.g., macroblocks). The second algorithm is based on a non-cooperative game and is intended for rate control of multiple video objects. In this algorithm, multiple objects non-cooperatively aim to take their fair share of the bit budget to control their visual quality. This algorithm ensures an optimal solution “under the given circumstances” by conforming to a Nash Equilibrium.
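In generic form (the notation here is mine and not necessarily that of [J51]), the cooperative bit-allocation problem for N macroblocks with per-block allocations b_i, utilities U_i(b_i) (e.g., a quality measure), disagreement points d_i, and frame budget B can be written as the Nash bargaining problem

    \max_{b_1,\dots,b_N} \; \prod_{i=1}^{N} \bigl( U_i(b_i) - d_i \bigr)
    \quad \text{subject to} \quad \sum_{i=1}^{N} b_i \le B, \qquad U_i(b_i) \ge d_i .

The maximizer is the Nash Bargaining Solution: it is Pareto-optimal and divides the quality surplus above the disagreement points fairly among the macroblocks, which is exactly the optimality-plus-fairness property described above.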
During the past ten years, various video coding standards have emerged, targeting different application areas. Standardization is necessary to ensure that products developed by various companies can inter-operate using a common set of rules. Video coding standards, however, only specify formats (syntax) for representing data and rules (semantics) for interpreting the data; they do not dictate encoding techniques (algorithms). This implies that there is no single unified way of implementing an encoder, and continuing improvements in the encoder are possible through various optimizations even after a standard has been defined. I have done research in standard-compliant video coding, contributing new rate control and motion estimation algorithms.
In a current project funded by Sun Microsystems, I am developing state-of-the-art H.264- and AVS-compliant software-based video encoders and decoders. The latest H.264 video coding standard provides significant improvements in coding efficiency compared to previous standards. The AVS standard is being developed in China, with active participation from several research labs and universities. However, AVS and H.264 encoders suffer from very high computational complexity, posing a challenge for researchers to design innovative algorithms. Generating good visual quality with reduced complexity is an open research problem, and is largely influenced by the motion estimation and rate control algorithms used in the encoder [J50], [J56].
My research in this project is to develop “intelligent and expert” motion estimation algorithms [J66] that can adapt to the video sequence characteristics and bit rate in order to yield the best quality with the fastest search time. The algorithms analyze the characteristics of a video sequence and maintain a memory of the “history” of the video scene with low overhead [J52]. They then utilize this meta-information to tune their parameters to suit the video sequence, thereby achieving the best visual quality while lowering computational complexity.
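The flavor of such content-adaptive motion estimation can be sketched as follows. This is purely illustrative: the parameter names, the L1 motion statistic, and the windowing policy are my own here and are not the algorithms of [J52] or [J66]:

    import numpy as np

    def adaptive_search_range(mv_history, base=8, lo=4, hi=32):
        """Pick a motion-search range from recent motion-vector history.

        mv_history: list of (dx, dy) motion vectors from earlier frames.
        Low past motion -> shrink the window (less computation);
        high past motion -> grow it (better quality for fast scenes).
        """
        if not mv_history:
            return base
        mags = [abs(dx) + abs(dy) for dx, dy in mv_history][-16:]
        avg = sum(mags) / len(mags)          # short "scene memory"
        return int(min(hi, max(lo, 2 * avg + 2)))

    def full_search(cur, ref, bx, by, block=16, rng=8):
        """Exhaustive SAD block match of one macroblock within +/- rng pixels."""
        h, w = ref.shape
        target = cur[by:by + block, bx:bx + block].astype(int)
        best, best_sad = (0, 0), np.inf
        for dy in range(-rng, rng + 1):
            for dx in range(-rng, rng + 1):
                y, x = by + dy, bx + dx
                if 0 <= y and y + block <= h and 0 <= x and x + block <= w:
                    sad = np.abs(ref[y:y + block, x:x + block].astype(int)
                                 - target).sum()
                    if sad < best_sad:
                        best_sad, best = sad, (dx, dy)
        return best

Because the cost of full search grows quadratically with the range, adapting the range to scene history is where the complexity savings come from.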
(c) Collaboration with Police
At UTA I have established sustainable partnerships with law enforcement agencies. The Institute for Research In Security (IRIS, www.iris.uta.edu) at UTA is a manifestation of these efforts. IRIS aims to develop innovative means (concepts, components, and systems) for improving the ability to maintain secure, usable environments. As director of IRIS, my key contribution is building a multi-disciplinary team, including faculty members from engineering, psychology, and the liberal arts. My second contribution is establishing an alliance with the Arlington Police Department (APD), which has been instrumental in expanding our research by assimilating police officers’ expertise and experience with real-world scenarios. Through this partnership, we have secured three major research grants in security-related projects amounting to more than $2,000,000 [G29], [G31], [G33]. In addition to the APD, collaborative initiatives with police departments from several counties in North Texas, as well as with the Department of Homeland Security, are being pursued. I have also established partnerships with the University of North Texas, Southern Methodist University, and the University of Texas at Dallas. A notable effort is the COPSe (Collaboration for Public Safety Enhancement) project that we are currently pursuing in close collaboration with the APD. Funded by the Department of Justice, COPSe is a distributed and networked system for providing and disseminating accurate, timely, and reliable audio-visual information, in a pervasive fashion, anywhere and anytime. COPSe provides police officers with powerful technological means to fight crime and to ensure their own safety as well as that of citizens and the environment. One of its components is a ubiquitous communication system using wearable devices attached to officers’ uniforms. This multi-party video communication has greatly benefited from my ongoing research in video compression, wireless communication, and video streaming.
(d) Broader Green and Sustainable Computing for Distributed Systems
With the explosive growth in computing and the growing scarcity of electric supply, reducing the energy consumption of large-scale computing systems has become a research issue of paramount importance. I have studied the problem of allocating tasks onto a computational grid with the aim of simultaneously minimizing energy consumption and makespan, subject to deadline constraints and tasks’ architectural requirements, and I have been addressing the same problem on multi-core platforms. At the other end of the spectrum, at the component level, multi-core processors are poised to dominate the landscape of next-generation computing. However, the lack of generally applicable methods and tools for allocating tasks to cores while economizing energy remains a key challenge for many application environments. I have proposed a new theoretical and experimental framework called multi-element and multi-objective (MEMO) optimization that simultaneously and flexibly optimizes the goals of energy minimization and performance maximization while accounting for constraints due to multiple architectural elements, such as the cores and caches of current and emerging multi-core processors. I have proposed solutions from cooperative game theory based on the concept of the Nash Bargaining Solution [J68], [J78], [J79], [C124], [C135], [C137].

Sustainable computing is a rapidly expanding research area spanning computer science and engineering, electrical engineering, and other engineering disciplines; it is currently my main research thrust. Big data centers and large-scale computing infrastructures consume substantial amounts of energy due to their massive size. The cost of powering and cooling these systems of systems is becoming comparable to the cost of acquisition. While a vast number of tools are available for service providers to schedule jobs on such systems, there is a lack of generally applicable methods for reducing energy consumption while ensuring good quality of service.
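As a toy illustration of the Nash-bargaining flavor of the joint energy-makespan optimization described above, the sketch below exhaustively scores task-to-core assignments on a two-core platform. The core speeds, power figures, and the exhaustive search are stand-ins for the actual MEMO algorithms of [J78]:

    from itertools import product

    # Hypothetical platform: (speed in ops/s, active power in watts) per core.
    CORES = [(2.0, 3.0), (1.0, 1.0)]   # one fast/hungry core, one slow/frugal core
    TASKS = [4.0, 2.0, 2.0, 1.0]       # task sizes in ops

    def evaluate(assign):
        """Return (makespan, energy) for a task->core assignment vector."""
        busy = [0.0] * len(CORES)
        energy = 0.0
        for size, c in zip(TASKS, assign):
            speed, power = CORES[c]
            busy[c] += size / speed
            energy += (size / speed) * power
        return max(busy), energy

    # Disagreement point: the worst makespan and energy over all assignments.
    all_assigns = list(product(range(len(CORES)), repeat=len(TASKS)))
    points = [evaluate(a) for a in all_assigns]
    d_time = max(p[0] for p in points)
    d_energy = max(p[1] for p in points)

    # Nash bargaining: maximize the product of improvements over the
    # disagreement point, treating makespan and energy as two "players".
    best = max(all_assigns,
               key=lambda a: (d_time - evaluate(a)[0]) * (d_energy - evaluate(a)[1]))
    print("assignment:", best, "-> (makespan, energy) =", evaluate(best))

Maximizing the product, rather than a weighted sum, is what yields the fairness-between-objectives property of the bargaining solution.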
Optimization of performance and energy consumption for big data centers and cloud computing systems requires developing proper scheduling mechanisms. Since energy and performance are inversely related through a non-linear relationship, workflow scheduling that meets timing requirements while economizing energy is a dual-objective optimization problem. This is further accentuated by the fact that different classes of applications may have different priorities and require diverse energy-performance tradeoffs. I am leading several efforts in sustainable computing and computing for sustainability, including the launch of a new Elsevier journal, Sustainable Computing: Informatics and Systems (SUSCOM), of which I am the founding editor-in-chief, and the launch of the International Green Computing Conference. Power-aware “green” computing requires a comprehensive and multi-pronged approach that involves myriad research issues; only a collective and holistic approach can lead to overall energy savings and have a positive impact on the environment. I am interested in an inter-disciplinary approach encompassing several research issues, including new means of reconfiguration and alternative energy sources for computers. Green computing spans several innovative research projects in interwoven multi-disciplinary themes, which would be of interest to many faculty members from engineering and social science departments.
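The non-linear energy-performance coupling noted above can be seen in the standard first-order CMOS/DVFS model (a textbook approximation, not a result from the cited papers): with switching activity α, capacitance C, supply voltage V, and clock frequency f,

    P_{\text{dyn}} \approx \alpha\, C\, V^{2} f, \qquad
    T \propto \frac{1}{f}, \qquad
    E = P_{\text{dyn}}\, T \;\propto\; V^{2} \;\propto\; f^{2} \quad (\text{for } V \propto f).

Halving the frequency (with a matching voltage drop) thus roughly doubles execution time while cutting dynamic energy to about a quarter, which is why makespan and energy must be optimized jointly rather than sequentially.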
(e) Performance, Energy, and Temperature Aware Scheduling on Multi-Core Processors and Clouds
The sophistication of modern hardware is advancing faster than ever before. Manycore designs have sustained sharp improvements in processor performance. Multicore-based hybrid processors (MHPs), consisting of 2D and 3D layers of many cores and specialized accelerators such as Graphics Processing Units (GPUs), can provide multi-teraflop peak performance with improved power requirements. Most Exascale machines are expected to consist of a large number of MHP nodes. The massive parallelism and multi-dimensional heterogeneity of current and future high-performance platforms differ sharply from the machines of the past and impose new challenges in executing complex scientific and engineering applications, most notably challenges related to power and thermal constraints. This motivates the development of multi-objective optimized algorithms that address not only performance but also energy and temperature constraints.
To capitalize on the computational capabilities of the new hardware while explicitly considering energy and thermal issues, I am developing performance, energy, and temperature (PET) optimized scheduling schemes for current and future generations of scalable MHPs. Taking into account complex architectures encompassing heterogeneity, accelerators, and multi-level caches, with chip-level and node-level scalability, I am working on a multi-layer scheduling system for next-generation parallel machines, including Exascale systems. The work also involves collaboration with AMD and Intel.

In the scheduling system, a parallel computation represented by an Iterative Task Graph (ITG) and associated multi-level data layers will be hierarchically unfolded as the scheduling moves down from nodes to chips and from chips to CPU/GPU cores. At the node level, the scheduling system will partition a parallel computation’s control and data. A lower layer will perform task-to-core mappings and voltage-frequency settings while addressing communication and cache optimization. In addition to Dynamic Voltage and Frequency Scaling (DVFS) settings, I am investigating core disabling and cache reconfiguration to reduce energy and thermal requirements. For the multi-objective optimization itself, I am utilizing novel optimization techniques, such as game theory and dynamic programming, and their combination.

I am aiming to develop a run-time system that will allow the user to specify the relative goals of optimizing performance, energy, and temperature. The system will employ a set of static scheduling algorithms to estimate (by simulating the task graph of the routine) whether these goals can be met. It will provide Pareto solutions in a 3D space (a minimal sketch of such Pareto filtering follows the list below) and allow the user to interactively move the desired point in the PET space using application and architecture knobs (size, cores, communication-to-computation ratio, etc.). If the desired goals cannot be met, the system will propose alternative solutions with trade-offs in P, E, or T. Once the user has selected a schedule, the system will map the program onto the target platform. The two expected contributions of my research are:
• Development of a new generation of efficient methodologies for solving the complex online triple-objective-optimized task-to-machine static and dynamic allocation problem on MHPs. Parallel and distributed processing of seemingly sequential but non-trivial scheduling algorithms will create transformational knowledge, leading to extensive publication activity. The outcome will include fast static and dynamic algorithms, allowing users and programmers to execute parallel computations that meet energy, temperature, and performance goals.
• Development of a novel and hierarchical framework for estimating the energy and temperature budgets and consumption of the system. My research will contribute towards addressing the performance limitations imposed by soaring power dissipation and temperature hotspot issues in future-generation systems. It will alleviate bottlenecks in existing systems and provide alternative approaches for future designs.
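The Pareto filtering step referenced above can be sketched generically as a dominance filter over candidate schedules in (time, energy, temperature) space. The candidate triples below are hypothetical and the filter is a generic technique, not the run-time system itself:

    def dominates(a, b):
        """True if schedule a is no worse than b in every PET objective
        (time, energy, peak temperature; all minimized) and strictly
        better in at least one."""
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    def pareto_front(candidates):
        """Keep only the non-dominated (time, energy, temp) points."""
        return [c for c in candidates
                if not any(dominates(o, c) for o in candidates if o is not c)]

    # Hypothetical candidate schedules as (seconds, joules, deg C) triples.
    candidates = [(10.0, 50.0, 70.0), (12.0, 35.0, 65.0),
                  (11.0, 60.0, 80.0), (15.0, 30.0, 60.0)]
    for point in pareto_front(candidates):
        print(point)

Presenting only the non-dominated points is what lets a user move a “desired point” through the PET space: every remaining schedule represents a genuine trade-off rather than a strictly worse choice.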
(f) Next Generation Software for Exascale Systems
Performance, energy, and temperature (PET) are now recognized as primary goals in almost all forms of computing, but especially in scalable high-performance computing systems, where massive computations can incur high energy bills and cooling costs. In collaboration with the University of Tennessee and the University of Florida, I am working towards developing PET-optimized linear algebra libraries for current and future generations of scalable hybrid computing systems. Specifically, we focus on Krylov subspace projection methods, which are widely used iterative methods for solving large-scale linear systems of equations. We are addressing performance-critical challenges related to synchronization, communication, and latency costs. This will be explored for linear system solvers (e.g., variants of GMRES, QMR, BiCGSTAB, and CG), block versions for handling linear systems with multiple right-hand sides, as well as block iterative eigensolvers (variations of Lanczos, Jacobi-Davidson, and LOBPCG); a sketch of the simplest of these solvers appears after the component list below. Our efforts will first focus on developing PET-parameterized algorithms for executing these solvers on the heterogeneous architectures envisioned by the PET-optimized Algebraic Library System (PETALS). The proposed research effort encompasses the following components:
1. Taking into account increasingly complex machine architectures, with heterogeneity, multi-level caches, and chip-level and node-level scalability, we propose a multi-layer hierarchical scheduling system employing game theory, dynamic programming, and evolutionary techniques. The scheduler will employ partitioning of control and data, utilizing Hierarchical Adaptive Tiling (HAT) and Iterative Task Graphs (ITGs) for capturing inter- and intra-iteration dependencies. A combination of optimization techniques will be utilized for static and dynamic resource management by hierarchically unfolding the granularity of HAT and ITGs at the node, chip, and CPU/GPU levels, all operating in conjunction with one another. The scheduler will also utilize communication-avoiding (CA) and synchronization-overhead-reduction techniques while determining task-to-core mappings and appropriate voltage-frequency settings.
2. We will design a common and auto-tunable framework for Krylov subspace solvers wherein each node of the ITG is the factorization of a sparse submatrix and the edges represent data movement. We will develop a suite of auto-tunable basic building blocks that will support this workflow. These building blocks will include support for matrix factorization computed in parallel on a variable number of CPU cores and/or coprocessor cores.
3. The work will include a decision supp
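To ground the solver class named above, here is a textbook conjugate-gradient iteration, the simplest of the listed Krylov methods. It is a generic illustration of the technique, not PETALS code:

    import numpy as np

    def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
        """Solve A x = b for symmetric positive-definite A via CG.

        Each iteration expands the Krylov subspace span{b, Ab, A^2 b, ...}
        by one dimension using a single matrix-vector product, which is
        why these methods suit large sparse systems.
        """
        x = np.zeros_like(b)
        r = b - A @ x          # residual
        p = r.copy()           # search direction
        rs = r @ r
        for _ in range(max_iter):
            Ap = A @ p
            alpha = rs / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:
                break
            p = r + (rs_new / rs) * p
            rs = rs_new
        return x

    # Small SPD test system.
    A = np.array([[4.0, 1.0], [1.0, 3.0]])
    b = np.array([1.0, 2.0])
    print(conjugate_gradient(A, b))   # approx [0.0909, 0.6364]

The dot products in each iteration (for alpha and the convergence test) are global reductions on a parallel machine; these are precisely the synchronization and communication costs that the PET-parameterized variants described above aim to reduce.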