Venmugil Elango
Microsoft
Verified email at osu.edu
Title
Cited by
Year
Diesel: DSL for linear algebra and neural net computations on GPUs
V Elango, N Rubin, M Ravishankar, H Sandanagobalane, V Grover
Proceedings of the 2nd ACM SIGPLAN International Workshop on Machine …, 2018
Cited by: 63, Year: 2018
Distributed memory code generation for mixed irregular/regular computations
M Ravishankar, R Dathathri, V Elango, LN Pouchet, J Ramanujam, ...
Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of …, 2015
Cited by: 41, Year: 2015
On characterizing the data access complexity of programs
V Elango, F Rastello, LN Pouchet, J Ramanujam, P Sadayappan
Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of …, 2015
Cited by: 31, Year: 2015
Accelerating Strassen-Winograd's matrix multiplication algorithm on GPUs
PW Lai, H Arafat, V Elango, P Sadayappan
20th Annual International Conference on High Performance Computing, 139-148, 2013
Cited by: 30, Year: 2013
Beyond reuse distance analysis: Dynamic analysis for characterization of data locality potential
N Fauzia, V Elango, M Ravishankar, J Ramanujam, F Rastello, A Rountev, ...
ACM Transactions on Architecture and Code Optimization (TACO) 10 (4), 1-29, 2013
Cited by: 28, Year: 2013
Spatial adaptive sampling in multiscale simulation
B Rouet-Leduc, K Barros, E Cieren, V Elango, C Junghans, T Lookman, ...
Computer Physics Communications 185 (7), 1857-1864, 2014
Cited by: 27, Year: 2014
On characterizing the data movement complexity of computational DAGs for parallel execution
V Elango, F Rastello, LN Pouchet, J Ramanujam, P Sadayappan
Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and …, 2014
Cited by: 18, Year: 2014
Accelerating linear algebra kernels for any processor architecture
V Elango, N Rubin, M Ravishankar, VK Grover
US Patent App. 16/277,661, 2019
Cited by: 15, Year: 2019
Data Access Complexity: The Red/Blue Pebble Game Revisited
V Elango, F Rastello, LN Pouchet, J Ramanujam, P Sadayappan
Cited by: 15, Year: 2013
On using the roofline model with lower bounds on data movement
V Elango, N Sedaghati, F Rastello, LN Pouchet, J Ramanujam, ...
ACM Transactions on Architecture and Code Optimization (TACO) 11 (4), 1-23, 2015
Cited by: 12, Year: 2015
With shared microexponents, a little shifting goes a long way
B Darvish Rouhani, R Zhao, V Elango, R Shafipour, M Hall, ...
Proceedings of the 50th Annual International Symposium on Computer …, 2023
Cited by: 11, Year: 2023
PaSE: Parallelization strategies for efficient DNN training
V Elango
2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2021
Cited by: 6, Year: 2021
Microscaling data formats for deep learning
BD Rouhani, R Zhao, A More, M Hall, A Khodamoradi, S Deng, ...
arXiv preprint arXiv:2310.10537, 2023
Cited by: 5, Year: 2023
Techniques for Characterizing the Data Movement Complexity of Computations
V Elango
The Ohio State University, 2016
Cited by: 3, Year: 2016
Hierarchical and shared exponent floating point data types
BD Rouhani, V Elango, R Shafipour, J Fowers, MG Liu, J Xi, DC Burger, ...
US Patent 11,886,833, 2024
2024
Systems and methods for sparse matrix multiplication
V Elango, BD Rouhani, ES Chung, DC Burger
US Patent App. 17/657,912, 2023
2023
Microscaling Data Formats for Deep Learning
B Darvish Rouhani, R Zhao, A More, M Hall, A Khodamoradi, S Deng, ...
arXiv e-prints, arXiv: 2310.10537, 2023
2023
Accelerating linear algebra kernels for any processor architecture
V Elango, N Rubin, M Ravishankar, V Grover
US Patent App. 18/136,233, 2023
2023
Shared Microexponents: A Little Shifting Goes a Long Way
B Rouhani, R Zhao, V Elango, R Shafipour, M Hall, M Mesmakhosroshahi, ...
arXiv preprint arXiv:2302.08007, 2023
2023
Sparsifying narrow data formats for neural networks
BD Rouhani, V Elango, ES Chung, DC Burger, MC Heddes, S Nishit, ...
US Patent App. 17/349,848, 2022
2022