Synchronization state buffer: supporting efficient fine-grain synchronization on many-core architectures W Zhu, VC Sreedhar, Z Hu, GR Gao ACM SIGARCH Computer Architecture News 35 (2), 35-45, 2007 | 135 | 2007 |
Minimum lock assignment: A method for exploiting concurrency among critical sections Y Zhang, V Sreedhar, W Zhu, V Sarkar, G Gao Languages and Compilers for Parallel Computing, 141-155, 2008 | 90* | 2008 |
ParalleX: A study of a new parallel computation model GR Gao, T Sterling, R Stevens, M Hereld, W Zhu Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE …, 2007 | 88 | 2007 |
FAST: A functionally accurate simulation toolset for the Cyclops64 cellular architecture J Del Cuvillo, W Zhu, Z Hu, GR Gao Workshop on Modeling, Benchmarking, and Simulation (MoBS2005), in conjunction …, 2005 | 85 | 2005 |
TiNy Threads: A thread virtual machine for the Cyclops64 cellular architecture J Del Cuvillo, W Zhu, Z Hu, GR Gao Parallel and Distributed Processing Symposium, 2005. Proceedings. 19th IEEE …, 2005 | 68 | 2005 |
Toward a software infrastructure for the Cyclops-64 cellular architecture J del Cuvillo, W Zhu, Z Hu, GR Gao High-Performance Computing in an Advanced Collaborative Environment, 2006 …, 2006 | 65 | 2006 |
Toward a software infrastructure for the Cyclops-64 cellular architecture J Cuvillo, W Zhu, Z Hu, GR Gao In the 20th International Symposium on High Performance Computing Systems …, 2006 | 65* | 2006 |
Transactional memory compatibility management D Groff, Y Levanoni, S Toub, MMK Magruder, W Zhu, TL Harris, CW Dern, ... US Patent 8,266,604, 2012 | 55 | 2012 |
Landing OpenMP on Cyclops-64: An efficient mapping of OpenMP to a many-core system-on-a-chip J Del Cuvillo, W Zhu, G Gao Proceedings of the 3rd conference on Computing frontiers, 41-50, 2006 | 46 | 2006 |
Optimization of dense matrix multiplication on IBM Cyclops-64: Challenges and experiences Z Hu, J del Cuvillo, W Zhu, G Gao Euro-Par 2006 Parallel Processing, 134-144, 2006 | 45 | 2006 |
Compiler-generated invocation stubs for data parallel programming model L Zhang, W Zhu, Y Levanoni, PF Ringseth, DCII Charles US Patent 8,589,867, 2013 | 43 | 2013 |
Binding data parallel device source code W Zhu, L Zhang, SS Sodhi, Y Levanoni US Patent 8,756,590, 2014 | 30 | 2014 |
Optimizing execution of kernels W Zhu, AK Agarwal, L Zhang, Y Levanoni US Patent 8,533,698, 2013 | 28 | 2013 |
Data Parallel Programming Model DCII Charles, PF Ringseth, Y Levanoni, W Zhu, L Zhang US Patent App. 12/819,097, 2010 | 28* | 2010 |
Binding executable code at runtime AK Agarwal, W Zhu, Y Levanoni US Patent 8,468,507, 2013 | 19 | 2013 |
Performance Characteristics of OpenMP Language Constructs on a Many-core-on-a-chip Architecture W Zhu, J Del Cuvillo, G Gao OpenMP Shared Memory Parallel Programming, 230-241, 2008 | 19 | 2008 |
Optimized lock assignment and allocation: A method for exploiting concurrency among critical sections Y Zhang, VC Sreedhar, W Zhu, V Sarkar, GR Gao Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of …, 2007 | 19 | 2007 |
Debugging in a multiple address space environment AK Agarwal, W Zhu, Y Levanoni, Y Zhu US Patent 8,677,322, 2014 | 16 | 2014 |
Performance portability on EARTH: a case study across several parallel architectures W Zhu, Y Niu, GR Gao Cluster Computing 10 (2), 115-126, 2007 | 16 | 2007 |
Implementing parallel hmm-pfam on the EARTH multithreaded architecture W Zhu, Y Niu, J Lu, GR Gao Bioinformatics Conference, 2003. CSB 2003. Proceedings of the 2003 IEEE, 549-550, 2003 | 16 | 2003 |