Yet another accelerated SGD: ResNet-50 training on ImageNet in 74.7 seconds M Yamazaki, A Kasagi, A Tabuchi, T Honda, M Miwa, N Fukumoto, ... arXiv preprint arXiv:1903.12650, 2019 | 99 | 2019 |
Understanding storage traffic characteristics on enterprise virtual desktop infrastructure C Lee, T Kumano, T Matsuki, H Endo, N Fukumoto, M Sugawara Proceedings of the 10th ACM International Systems and Storage Conference, 1-11, 2017 | 74 | 2017 |
Optimizing power-performance trade-off for parallel applications through dynamic core and frequency scaling S Imamura, H Sasaki, N Fukumoto, K Inoue, K Murakami Proceedings of the RESoLVE 12, 2012 | 12 | 2012 |
MLPerf™ HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems S Farrell, M Emani, J Balma, L Drescher, A Drozd, A Fink, G Fox, D Kanter, ... 2021 IEEE/ACM Workshop on Machine Learning in High Performance Computing …, 2021 | 11 | 2021 |
Analyzing the impact of data prefetching on Chip MultiProcessors N Fukumoto, T Mihara, K Inoue, K Murakami 2008 13th Asia-Pacific Computer Systems Architecture Conference, 1-8, 2008 | 8 | 2008 |
3D implemented SRAM/DRAM hybrid cache architecture for high-performance and low power consumption K Inoue, S Hashiguchi, S Ueno, N Fukumoto, K Murakami 2011 IEEE 54th International Midwest Symposium on Circuits and Systems …, 2011 | 5 | 2011 |
Preliminary performance analysis of distributed DNN training with relaxed synchronization K Shirahata, A Haderbache, N Fukumoto, K Nakashima IEICE Transactions on Electronics 104 (6), 257-260, 2021 | 2 | 2021 |
Proposal of a run-time operation mode decision method for SRAM/DRAM hybrid caches S Hashiguchi, N Fukumoto, K Inoue, K Murakami IPSJ SIG Technical Reports: Computer Architecture (ARC) 2011 (9), 1-6, 2011 | 2 | 2011 |
Performance balancing: software-based on-chip memory management for effective CMP executions N Fukumoto, K Imazato, K Inoue, K Murakami Proceedings of the 10th workshop on MEmory performance: DEaling with …, 2009 | 2 | 2009 |
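Compute/memory performance balancing for multicore processors based on adaptive helper-thread execution K Imazato, N Fukumoto, K Inoue, K Murakami IPSJ SIG Technical Reports: System Software and Operating Systems (OS) 2009 (16), 1-8, 2009 | 2 | 2009 |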
Performance analysis of multi-containerized MD simulations for low-level resource allocation S Okuno, A Hirai, N Fukumoto 2022 IEEE International Parallel and Distributed Processing Symposium …, 2022 | 1 | 2022 |
A traffic-aware memory-cube network using bypassing Y Shikama, R Kawano, H Matsutani, H Amano, Y Nagasaka, N Fukumoto, ... Microprocessors and Microsystems 90, 104471, 2022 | 1 | 2022 |
mpiQulacs: A Distributed Quantum Computer Simulator for A64FX-based Cluster Systems S Imamura, M Yamazaki, T Honda, A Kasagi, A Tabuchi, H Nakao, ... arXiv preprint arXiv:2203.16044, 2022 | 1 | 2022 |
The 16,384-node Parallelism of 3D-CNN Training on An Arm CPU based Supercomputer A Tabuchi, K Shirahata, M Yamazaki, A Kasagi, T Honda, K Kurihara, ... 2021 IEEE 28th International Conference on High Performance Computing, Data …, 2021 | 1 | 2021 |
Towards straggler-tolerant and accuracy-aware distributed DNN training in clouds S Okuno, M Miwa, N Fukumoto 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet …, 2021 | 1 | 2021 |
Low-latency low-energy memory-cube networks using dual-voltage datapaths Y Shikama, R Kawano, H Matsutani, H Amano, Y Nagasaka, N Fukumoto, ... 2021 29th Euromicro International Conference on Parallel, Distributed and …, 2021 | 1 | 2021 |
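A method for deriving optimal application execution parameters under power constraints M Ono, N Fukumoto, T Honda, K Nakashima IPSJ SIG Technical Reports: High Performance Computing (HPC) 2017 (4), 1-7, 2017 | 1 | 2017 |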
Analysis of performance behavior under power constraints on Knights Landing M Ono, N Fukumoto, K Nakashima IPSJ SIG Technical Reports: High Performance Computing (HPC) 2017 (22), 1-6, 2017 | 1 | 2017 |
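Proposal of a technique for improving many-core processor performance by dynamically changing the core count and operating frequency S Imamura, H Sasaki, N Fukumoto, K Inoue, K Murakami IPSJ Transactions on Advanced Computing Systems (ACS) 5 (4), 24-35, 2012 | 1 | 2012 |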
SRAM/DRAM hybrid cache architecture for 3D-stacked LSIs S Ueno, S Hashiguchi, N Fukumoto, K Inoue, K Murakami IPSJ Transactions on Advanced Computing Systems (ACS) 5 (1), 41-52, 2012 | 1 | 2012 |