Elasticsearch For Beginners: Indexing your Gmail Inbox
What's this all about?
I recently looked at my Gmail inbox and noticed that I have well over 50k emails taking up about 12GB of space, but there is no good way to tell which emails take up the space, who I send email to, who emails me, and so on.
The goal of this tutorial is to load an entire Gmail inbox into Elasticsearch using bulk indexing and then start querying the cluster to get a better picture of what's going on.
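As a rough illustration of the bulk-indexing step, here is a minimal sketch that turns raw email text into the newline-delimited body the Elasticsearch bulk API expects (one action line followed by one document line per email). The index name "gmail", the field names, and the helper name to_bulk_body are assumptions made for this example, not taken from the tutorial; sending the body to the cluster is left out.

```python
import json
from email import message_from_string


def to_bulk_body(raw_messages, index_name="gmail"):
    """Build an NDJSON bulk body: an action line, then a document line, per email."""
    lines = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        # Hypothetical document shape: a few headers plus the raw message size.
        doc = {
            "from": msg["From"],
            "to": msg["To"],
            "subject": msg["Subject"],
            "size_bytes": len(raw.encode("utf-8")),
        }
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk API requires every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    sample = "From: a@example.com\nTo: b@example.com\nSubject: Hi\n\nHello there"
    print(to_bulk_body([sample]), end="")
```

The resulting string would be posted to the cluster's _bulk endpoint with the content type application/x-ndjson; batching many emails per request is what makes bulk indexing faster than indexing one document at a time.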
Probabilistic filters are high-speed, space-efficient data structures that support set-membership tests with a one-sided error. These filters can claim that a given entry is definitely not represented in a set of entries, or might be represented in the set. That is, negative responses are conclusive, whereas positive responses incur a small false positive probability (FPP).
The trade-off for this one-sided error is space-efficiency. Cuckoo Filters and Bloom Filters require approximately 7 bits per entry at 3% FPP, regardless of the size of the entries. This makes them useful for applications where the volume of original data makes traditional storage impractical.
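To make the one-sided error and the "about 7 bits per entry" figure concrete, here is a minimal Bloom filter sketch. It uses the standard sizing formulas (m = -n ln p / (ln 2)^2 bits and k = (m/n) ln 2 hash functions) and derives its k bit positions from a single SHA-256 digest by double hashing. The class and method names are assumptions for this example, not any particular library's API.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, capacity, fpp=0.03):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(1, math.ceil(-capacity * math.log(fpp) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: combine two 64-bit values taken from one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


if __name__ == "__main__":
    bf = BloomFilter(capacity=1000, fpp=0.03)
    bf.add("alice@example.com")
    print(bf.num_bits / 1000, "bits per entry")
    print(bf.might_contain("alice@example.com"))
```

Note that the bit array size depends only on the expected number of entries and the target FPP, not on how large each entry is, which is where the space savings come from.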
Bloom filters have been in use since the 1970s and are well understood. Implementations are widely available. Variants exist that support deletion and counting, though with expanded storage requirements.
Cuckoo filters were described in "Cuckoo Filter: Practically Better Than Bloom", a paper by researchers at CMU in 2014. Cuckoo filters improve on Bloom filters by supporting deletion, limited counting, and bounded FPP, with storage efficiency similar to that of a standard Bloom filter.
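The deletion and limited-counting behaviour can be illustrated with a small sketch of partial-key cuckoo hashing: each entry is reduced to a short fingerprint that may live in one of two buckets, and the second bucket is found by XOR-ing the first index with a hash of the fingerprint. The 8-bit fingerprint, bucket size of 4, and all names below are assumptions chosen for illustration, not the paper's reference implementation.

```python
import hashlib
import random


class CuckooFilter:
    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500):
        # A power-of-two bucket count keeps the XOR trick inside the index range.
        if num_buckets < 1 or num_buckets & (num_buckets - 1):
            raise ValueError("num_buckets must be a power of two")
        self.mask = num_buckets - 1
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]

    @staticmethod
    def _hash(data):
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint_and_index(self, item):
        h = self._hash(item.encode("utf-8"))
        return (h >> 32) & 0xFF, h & self.mask  # 8-bit fingerprint, first bucket

    def _alt_index(self, index, fingerprint):
        # Applying this twice returns the original index.
        return index ^ (self._hash(bytes([fingerprint])) & self.mask)

    def insert(self, item):
        fp, i1 = self._fingerprint_and_index(item)
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        i = random.choice((i1, i2))
        for _ in range(self.max_kicks):
            # Evict a random resident fingerprint and move it to its other bucket.
            slot = random.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # the filter is treated as full

    def might_contain(self, item):
        fp, i1 = self._fingerprint_and_index(item)
        return fp in self.buckets[i1] or fp in self.buckets[self._alt_index(i1, fp)]

    def delete(self, item):
        fp, i1 = self._fingerprint_and_index(item)
        for i in (i1, self._alt_index(i1, fp)):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False
```

Inserting the same entry twice stores two copies of its fingerprint, so it must be deleted twice before lookups report it absent. That is the "limited counting": the count is capped by the number of slots in an entry's two buckets.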
GitHub - apple/turicreate: Turi Create simplifies the development of custom machine learning models.
Quantum Development Kit
An AngularJS 1.x WebSocket service for connecting client applications to servers.
Project DeepSpeech is an open source Speech-To-Text engine. It uses a model trained by machine learning techniques, based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow project to make the implementation easier.
Flight rules for git
Natural language processing
container-diff: Diff your Docker containers
The Yocto Project is an open source collaboration project that provides templates, tools and methods to help you create custom Linux-based systems for embedded products regardless of the hardware architecture. It was founded in 2010 as a collaboration among many hardware manufacturers, open-source operating systems vendors, and electronics companies to bring some order to the chaos of embedded Linux development.
Lecture "Programmieren in Rust" (Programming in Rust) at the University of Osnabrück, winter semester 2016/17. Slides and further information: https://github.com/LukasKalbertodt/programmieren-in-rust
Which is exactly what we've been doing. Like Scala, Vue works really, really well when used properly. It turns out Vue isn't a buzzword; Vue is a workhorse. A lot of our problems have been solved, by us and others. We still have problems, but we now have a reproducible "way to write Vue." We don't adopt every new idea out there, but we have changed a few things since we last spoke.
vavr - turns java™ upside down
In functional-style programming, functions may both receive and return other functions. Instead of a function being simply a factory or producer of an object, as in traditional object-oriented programming, it is also able to create and return another function. Functions returning functions can result in cascading lambdas, especially in highly concise code. While this syntax may look quite strange at first, it has its uses. This article will help you recognize cascading lambdas and understand their nature and purpose in code.
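To show the shape of a cascading lambda, here is a small sketch written in Python rather than the article's own language; the names add3 and add_ten are made up for the example. Each lambda takes one argument and returns another lambda until all the arguments have been supplied.

```python
# A lambda that returns a lambda that returns a lambda.
add3 = lambda x: lambda y: lambda z: x + y + z

# Supplying the arguments one call at a time "cascades" through the lambdas.
total = add3(1)(2)(3)

# Stopping early yields a reusable, partially applied function.
add_ten = add3(4)(6)

if __name__ == "__main__":
    print(total)
    print(add_ten(5))
```

Reading such a chain left to right, each arrow (here, each "lambda ...:") marks one more function being returned, which is why the syntax can look strange until the nesting is recognized.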
Angular Style Guide: A starting point for Angular development teams to provide consistency through good practices.
This second part of the Stateless Spring Security series is about exploring means of authentication in a stateless way. If you missed the first part about CSRF you can find it here.
So when talking about authentication, it's all about having the client identify itself to the server in a verifiable manner. Typically this starts with the server providing the client with a challenge, like a request to fill in a username / password. Today I want to focus on what happens after passing such an initial (manual) challenge and how to deal with automatic re-authentication of further HTTP requests.
Gladys is an open-source program which runs on your Raspberry Pi.
It communicates with all your devices and checks your calendar to help you in your everyday life.
A brief introduction to common Git workflows including the Centralized Workflow, Feature Branch Workflow, Gitflow Workflow, and Forking Workflow.
This script will convert notes from Synology Note Station to plain-text markdown notes. The script is written in Python and should work on any desktop platform. It's tested on Linux and Windows 7.