More recent articles: see Google Scholar.
2018-10
Randen - fast backtracking-resistant random generator with AES+Feistel+Reverie (GitHub)
Unpredictable and backtracking-resistant random number generation faster than MT19937.
2016-03
Fast strong hash functions: SipHash/HighwayHash (GitHub)
High-speed hashes with thorough mixing and near-cryptographic strength.
Provides SipHash and a tree version (1.5x and 4.2x speedup) plus
"HighwayHash" (10x speedup).
2011-10
Efficient Algorithms for Large-Scale Image Analysis
PhD thesis demonstrating the feasibility of analyzing gigapixel images
within minutes on a single workstation. Introduces seven new algorithms
for various stages of the analysis pipeline that outperform
previous techniques by factors of 10-100 while maintaining output quality.
2011-09
Engineering the Ideal Gigapixel Image Viewer [bibtex]
Smooth pan and zoom in gigapixel images via lossless compression, asynchronous I/O and shaders.
2011-08-31
Lossless asymmetric single instruction multiple data codec [bibtex]
Novel SIMD predictor and entropy coder: 50% compression and 3 GB/s (per core) decompression.
2011-08
Engineering a Multi-Core Radix Sort [bibtex]
Expanded version published at EuroPar 2011; 10% speedup vs. 2010-08-17 technical report.
2010-09
Highly optimized weighted-IHS pan sharpening with edge-preserving denoising [bibtex]
Pan sharpening: fast (100 MPixel/s) and high-quality (reduced noise, adaptive weights).
2010-09
Fast, High-Quality Line Antialiasing by Prefiltering with an Optimal Cubic Polynomial [bibtex]
Software line rasterizer with optimal low-pass filter; outperforms mid-range GPUs.
2010-09
Highly Efficient Screening for Point-Like Targets via Concentric Shells [bibtex]
Asymptotically optimal, pipelined divide-and-conquer algorithm for finding point-like objects.
2010-08-17
Faster Radix Sort via Virtual Memory and Write-Combining [bibtex]
Sort throughput > 88% of memory bandwidth (1.24x speedup vs. a Fermi GPU).
2009-03-27
An Efficient Parallel Algorithm for Graph-Based Image Segmentation [bibtex]
Fast but high-quality image segmentation, made possible by a new
parallel algorithm that
doesn't just chop images into tiles.
Additional Information: Paper, Presentation, Poster
2008-02
Determination of Maximally Stable Extremal Regions in Large Images [bibtex]
Efficient algorithm for extracting MSERs (e.g. for image segmentation).
2007-06-10
Timing Pitfalls and Solutions [171 KB]
Describes PC timing hardware, its pitfalls for reliable,
high-resolution timing, and a solution.
2007-03-29
Automatische Gebäudemodellierung aus Laserscanning-Daten [DE, 2443 KB]
Diploma thesis: an algorithm for automatic building reconstruction from
point clouds.
Additional Information: Presentation and Video
2006-03-26
Speeding up Memory Copy [135 KB]
A drop-in replacement for VC7.1's memcpy that is 3.5 times as
fast on an Athlon XP.
2006-04-07
Optimizing File Accesses via Ordering and Caching [145 KB]
Study thesis: how to speed up file loading by a factor of 10.
2002-11-10
Introduction to Program Optimization [12 KB]
A quick rundown on optimizing for size and speed.