Abusing Vector Search for Texts, Maps, and Chess ♟️

Vector Search is hot! Everyone is pouring resources into a seemingly new and AI-related topic. But are there any non-AI-related use cases? Are there features you want from your vector search engine, but are too afraid to ask? Last week was 🔥 for vector search. Weaviate raised $50M, and Pinecone raised $100M... That's a lot and makes you believe that vector search is hard. But it's not. I have spent the last few days implementing a single-file vector search engine....

May 9, 2023 · 10 min · 2077 words · Ashot Vardanian

How Junior and Senior C++ Devs Locate Unique Strings

Some of the most common questions in programming interviews are about strings - reversing them, splitting, joining, counting, etc. These days, having to interview more and more developers across the whole spectrum, we see how vastly the solutions, even to the most straightforward problems, differ depending on experience. Let’s imagine a test with the following constraints: You must find the first occurrence of every unique string in a non-empty array. You are only allowed to use the standard library, no other dependencies....

May 9, 2023 · 8 min · 1641 words · Ashot Vardanian

Mastering C++ with Google Benchmark ⏱️

Very few consider C++ attractive, and hardly anyone thinks it’s easy. Choosing it for a project generally means you care about the performance of your code. And rightly so! Today machines can process hundreds of Gigabytes per second, and we, as developers, should all learn to saturate those capabilities. So let’s look into a few simple code snippets and familiarize ourselves with Google Benchmark (GB) - the most famous library in the space....

March 4, 2022 · 12 min · 2555 words · Ashot Vardanian

Failing to Reach DDR4 Bandwidth 🚌

A bit of history. Not so long ago, we tried to use GPU acceleration from Python. We benchmarked NumPy vs CuPy in the most common number-crunching tasks. We took the highest-end desktop CPU and the highest-end desktop GPU and put them to the test. The GPU, expectedly, won, but not just in Matrix Multiplications. Sorting arrays, finding medians, and even simple accumulation were vastly faster. So we implemented multiple algorithms for parallel reductions in C++ and CUDA, just to compare efficiency....

January 29, 2022 · 6 min · 1215 words · Ashot Vardanian

Crushing CPUs with 879 GB/s Reductions in CUDA

GPU acceleration can be trivial for Python users. Follow CUDA installation steps carefully, replace import numpy as np with import cupy as np, and you will often get 100x performance boosts without breaking a sweat. Every time you write magical one-liners, remember a systems engineer is making your dreams come true. A couple of years ago, when I was giving a talk on the breadth of GPGPU technologies, I published a repo....

January 28, 2022 · 10 min · 1996 words · Ashot Vardanian

Apple to Apple Comparison: M1 Max vs Intel 🍏

This will be a story about many things: about computers, about their (memory) speed limits, about very specific workloads that can push computers to those limits, and the subtle differences in Hash-Table (HT) designs. But before we get in, here is a glimpse of what we are about to see. A friendly warning: the following article contains many technical terms and is intended for somewhat technical and hopefully curious readers....

December 21, 2021 · 8 min · 1618 words · Ashot Vardanian

Hyperscaler Shopping List: 2022 Data Center Tech Frenzy ☁️

A single software company can spend over 💲10 Billion/year on data centres, but not every year is the same. When all stars align, we see bursts of new technologies reaching the market simultaneously, thus restarting the purchasing super-cycle. 2022 will be just that, so let’s jump a couple of quarters ahead and see what’s on the shopping list of your favorite hyperscaler! Friendly warning: this article is full of technical terms and jargon, so it may be hard to read if you don’t write code or haven’t assembled computers before....

December 7, 2021 · 15 min · 3003 words · Ashot Vardanian

Only 1% of Software Benefits from SIMD Instructions

David Patterson recently mentioned that (rephrasing): programmers may benefit from using complex instruction sets directly, but it is increasingly challenging for compilers to automatically generate them in the right spots. In the last 3-4 years I have given a bunch of talks on the intricacies of SIMD programming, highlighting the divergence in hardware and software design in the past ten years. Chips are becoming bigger and more complicated to add more functionality, but general-purpose compilers like GCC, LLVM, MSVC and ICC cannot keep up with the pace....

November 21, 2021 · 7 min · 1406 words · Ashot Vardanian

Come to Armenia 🇦🇲

Borders are closed, people are sitting at home, but I bet most of you dream about traveling again. I want to invite you all to my country of origin - Armenia. It has something to offer to every group of people - tourists, entrepreneurs and investors! ...

August 1, 2020 · 8 min · 1680 words · Ashot Vardanian