Scott Levy
Title
Cited by
Year
Using simulation to evaluate the performance of resilience strategies at scale
S Levy, B Topp, KB Ferreira, D Arnold, T Hoefler, P Widener
High Performance Computing Systems. Performance Modeling, Benchmarking and …, 2014
Cited by: 40 · Year: 2014
Lessons learned from memory errors observed over the lifetime of Cielo
S Levy, KB Ferreira, N DeBardeleben, T Siddiqua, V Sridharan, ...
SC18: International Conference for High Performance Computing, Networking …, 2018
Cited by: 38 · Year: 2018
Understanding the effects of communication and coordination on checkpointing at scale
KB Ferreira, P Widener, S Levy, D Arnold, T Hoefler
SC'14: Proceedings of the International Conference for High Performance …, 2014
Cited by: 35 · Year: 2014
Understanding performance interference in next-generation HPC systems
OH Mondragon, PG Bridges, S Levy, KB Ferreira, P Widener
SC'16: Proceedings of the International Conference for High Performance …, 2016
Cited by: 31 · Year: 2016
Lifetime memory reliability data from the field
T Siddiqua, V Sridharan, SE Raasch, N DeBardeleben, KB Ferreira, ...
2017 IEEE International Symposium on Defect and Fault Tolerance in VLSI and …, 2017
Cited by: 26 · Year: 2017
Faodel: Data management for next-generation application workflows
C Ulmer, S Mukherjee, G Templet, S Levy, J Lofstead, P Widener, ...
Proceedings of the 9th Workshop on Scientific Cloud Computing, 1-6, 2018
Cited by: 21 · Year: 2018
Characterizing MPI matching via trace-based simulation
KB Ferreira, S Levy, K Pedretti, RE Grant
Proceedings of the 24th European MPI Users' Group Meeting, 1-11, 2017
Cited by: 21 · Year: 2017
Improving DRAM fault characterization through machine learning
E Baseman, N DeBardeleben, K Ferreira, S Levy, S Raasch, V Sridharan, ...
2016 46th Annual IEEE/IFIP International Conference on Dependable Systems …, 2016
Cited by: 19 · Year: 2016
Exploring the effect of noise on the performance benefit of nonblocking allreduce
P Widener, KB Ferreira, S Levy, T Hoefler
Proceedings of the 21st European MPI Users' Group Meeting, 77-82, 2014
Cited by: 15 · Year: 2014
Empress: extensible metadata provider for extreme-scale scientific simulations
M Lawson, C Ulmer, S Mukherjee, G Templet, J Lofstead, S Levy, ...
Proceedings of the 2nd Joint International Workshop on Parallel Data Storage …, 2017
Cited by: 14 · Year: 2017
Using unreliable virtual hardware to inject errors in extreme-scale systems
S Levy, MGF Dosanjh, PG Bridges, KB Ferreira
Proceedings of the 3rd Workshop on Fault-tolerance for HPC at extreme scale …, 2013
Cited by: 13 · Year: 2013
An examination of the impact of failure distribution on coordinated checkpoint/restart
S Levy, KB Ferreira
Proceedings of the ACM Workshop on Fault-Tolerance for HPC at Extreme Scale …, 2016
Cited by: 10 · Year: 2016
Evaluating the feasibility of using memory content similarity to improve system resilience
S Levy, PG Bridges, KB Ferreira, AP Thompson, C Trott
Proceedings of the 3rd International Workshop on Runtime and Operating …, 2013
Cited by: 10 · Year: 2013
Hardware MPI message matching: Insights into MPI matching behavior to inform design
K Ferreira, RE Grant, MJ Levenhagen, S Levy, T Groves
Concurrency and Computation: Practice and Experience 32 (3), e5150, 2020
Cited by: 9 · Year: 2020
Using simulation to examine the effect of MPI message matching costs on application performance
S Levy, KB Ferreira
Proceedings of the 25th European MPI Users' Group Meeting, 1-11, 2018
Cited by: 9 · Year: 2018
Scheduling in-situ analytics in next-generation applications
OH Mondragon, PG Bridges, S Levy, KB Ferreira, P Widener
2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2016
Cited by: 9 · Year: 2016
Using simulation to evaluate the performance of resilience strategies and process failures
SN Levy, BE Topp, DC Arnold, KB Ferreira, P Widener, T Hoefler
Sandia National Lab.(SNL-NM), Albuquerque, NM (United States), 2014
Cited by: 9 · Year: 2014
“Smarter” NICs for faster molecular dynamics: a case study
S Karamati, C Hughes, KS Hemmert, RE Grant, WW Schonbein, S Levy, ...
2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2022
Cited by: 8 · Year: 2022
On noise and the performance benefit of nonblocking collectives
PM Widener, S Levy, KB Ferreira, T Hoefler
The International Journal of High Performance Computing Applications 30 (1 …, 2016
Cited by: 8 · Year: 2016
RaDD runtimes: Radical and different distributed runtimes with SmartNICs
RE Grant, W Schonbein, S Levy
2020 IEEE/ACM Fourth Annual Workshop on Emerging Parallel and Distributed …, 2020
Cited by: 7 · Year: 2020