Name: MACIEJ BESTA
Papers: 39
Collaborators: 139
Citations: 193
PageRank: 18.04
Referers: 655
Referees: 1999
References: 794
| Title | Citations | PageRank | Year |
| --- | --- | --- | --- |
| Asynchronous Distributed-Memory Triangle Counting and LCC with RMA Caching | 1 | 0.37 | 2022 |
| I/O-Optimal Cache-Oblivious Sparse Matrix-Sparse Matrix Multiplication | 0 | 0.34 | 2022 |
| Motif Prediction with Graph Neural Networks | 0 | 0.34 | 2022 |
| SeBS: a serverless benchmark suite for function-as-a-service computing | 2 | 0.39 | 2021 |
| On the parallel I/O optimality of linear algebra kernels: near-optimal matrix factorizations | 0 | 0.34 | 2021 |
| GraphMineSuite: enabling high-performance and programmable graph mining algorithms with set algebra | 1 | 0.35 | 2021 |
| The future is big graphs: a community view on graph processing systems | 1 | 0.40 | 2021 |
| SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems | 4 | 0.38 | 2021 |
| Pebbles, Graphs, and a Pinch of Combinatorics: Towards Tight I/O Lower Bounds for Statically Analyzable Programs | 0 | 0.34 | 2021 |
| On the parallel I/O optimality of linear algebra kernels: near-optimal LU factorization | 0 | 0.34 | 2021 |
| High-Performance Routing With Multipathing and Path Diversity in Ethernet and HPC Networks | 1 | 0.40 | 2021 |
| GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra. | 0 | 0.34 | 2021 |
| Parallel Algorithms for Finding Large Cliques in Sparse Graphs | 1 | 0.35 | 2021 |
| Substream-Centric Maximum Matchings on FPGA | 0 | 0.34 | 2020 |
| High-performance parallel graph coloring with strong guarantees on work, depth, and quality | 0 | 0.34 | 2020 |
| FatPaths: routing in supercomputers and data centers when shortest paths fall short | 0 | 0.34 | 2020 |
| Slim graph: practical lossy graph compression for approximate graph processing, storage, and analytics | 0 | 0.34 | 2019 |
| A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning | 3 | 0.46 | 2019 |
| Substream-Centric Maximum Matchings on FPGA. | 1 | 0.40 | 2019 |
| FatPaths: Routing in Supercomputers, Data Centers, and Clouds with Low-Diameter Networks when Shortest Paths Fall Short. | 0 | 0.34 | 2019 |
| Graph Processing on FPGAs: Taxonomy, Survey, Challenges. | 0 | 0.34 | 2019 |
| Red-blue pebbling revisited: near optimal parallel matrix-matrix multiplication | 6 | 0.48 | 2019 |
| Network-accelerated non-contiguous memory transfers | 0 | 0.34 | 2019 |
| Enabling highly scalable remote memory access programming with MPI-3 one sided. | 0 | 0.34 | 2018 |
| Log(graph): a near-optimal high-performance graph representation | 0 | 0.34 | 2018 |
| Survey and Taxonomy of Lossless Graph Compression and Space-Efficient Graph Representations. | 0 | 0.34 | 2018 |
| Slim NoC: A Low-Diameter On-Chip Network Topology for High Energy Efficiency and Scalability. | 2 | 0.35 | 2018 |
| Communication-avoiding parallel minimum cuts and connected components. | 2 | 0.36 | 2018 |
| To Push or To Pull: On Reducing Communication and Synchronization in Graph Computations. | 26 | 0.73 | 2017 |
| SlimSell: A Vectorizable Graph Representation for Breadth-First Search | 6 | 0.47 | 2017 |
| Scaling betweenness centrality using communication-efficient sparse matrix multiplication | 6 | 0.41 | 2017 |
| High-Performance Distributed RMA Locks. | 1 | 0.35 | 2016 |
| Betweenness Centrality is more Parallelizable than Dense Matrix Multiplication. | 0 | 0.34 | 2016 |
| Evaluating the Cost of Atomic Operations on Modern Architectures. | 16 | 0.84 | 2015 |
| Active Access: A Mechanism for High-Performance Distributed Data-Centric Computations | 3 | 0.40 | 2015 |
| Accelerating Irregular Computations with Hardware Transactional Memory and Active Messages | 3 | 0.37 | 2015 |
| Slim Fly: A Cost Effective Low-Diameter Network Topology | 65 | 2.12 | 2014 |
| Fault tolerance for remote memory access programming models | 8 | 0.51 | 2014 |
| Enabling highly-scalable remote memory access programming with MPI-3 one sided | 34 | 1.41 | 2013 |