If you no longer want Facebook tracking, Openbook may be right for you.
Openbook is the name of the next social network to compete with Facebook. And if you’re tired of the way Facebook uses (but doesn’t sell) your data to target advertising, you might be interested.
According to the Financial Times, which revealed the project, the creators of this future platform want to do better than Facebook on data protection. First of all, the site will be open source, which means that anyone will be able to analyze its code to study how it works. And unlike Facebook, Openbook will not track its users and will not earn money through advertising.
Other differences could distinguish Openbook from its rival with 2 billion users. “It’s really about building a social network that respects the privacy of its users, that’s the main driver for me. But we realized that if we wanted to succeed, we needed to bring more to the table, we didn’t just want to build a Facebook clone,” said Joel Hernandez, the head of the new initiative. According to the FT, he had long wanted to create a less invasive alternative to Facebook but decided to act now to take advantage of the Cambridge Analytica effect on public opinion regarding privacy. He reportedly plans to bring more customization options and elements that will make Openbook “more fun.”
The Openbook initiative is also supported by Philip Zimmermann, the inventor of the PGP encryption program.
A marketplace to replace advertising
To make money, Openbook will adopt a different business model from Facebook’s. Instead of using user data to target advertising, it plans to launch a kind of marketplace on which it will be possible to sell items (and through which Openbook can charge commissions).
But for now, the initiative will first have to go through crowdfunding. Instead of a classic fundraising round, its creators have decided to launch a crowdfunding campaign on Kickstarter, which should start Tuesday. If all goes well, the contributors to this campaign should be the first to access the Openbook beta.
In addition, Openbook could take advantage of the data portability imposed by the GDPR to allow new users to transfer their personal information from Twitter or Facebook to the new platform.
While the idea of creating an alternative to Facebook with more privacy can only be laudable, it remains to be seen whether the masses will adopt this new platform. Other sites have already tried to carve out a place in the social networking landscape, but most of them have failed. We could cite the example of 800 pound Gorilla, an open source and decentralized Twitter rival that made a buzz for a week before falling into oblivion.
I’ve decided to jump on the bandwagon and spill my thoughts on “NoSQL” since it’s been such a hot topic lately. Since I work on the Drizzle project some folks would probably think I take the SQL side of the “debate,” but actually I’m pretty objective about the topic and find value in projects on both sides. Let me explain.
Last November at OpenSQL Camp I assembled a panel to debate “SQL vs NoSQL.” We had folks representing a variety of projects, including Cassandra, CouchDB, Drizzle, MariaDB, MongoDB, MySQL, and PostgreSQL.
Even though I realized this was a poor name for such a panel, I went with it anyways because this “debate” was really starting to heat up. The conclusion I was hoping for is that the two are not at odds because the two categories of projects can peacefully co-exist in the same toolbox for data management. Beyond the panel name, even the term “NoSQL” is a bit misleading. I talked with Eric Evans (one of my new co-workers over on the Cassandra team) who reintroduced the term, and even he admits it is vague and doesn’t do the projects categorized by it any favors. What happens when Cassandra has a SQL interface stacked on top of it? Yeah.
One reason for all this confusion is that for some people, the term “database” equates to “relational database.” This makes the non-relational projects look foreign because they don’t fit the database model that became “traditional” due to its popularity. Anyone who has ever read up on other database models would quickly realize relational is just one of many models, and many of the “NoSQL” projects fit quite nicely into one of these categories.
The real value these new projects provide is in their implementation details, especially with dynamic scale-out (adding new nodes to live systems) and synchronization mechanisms (eventual consistency or tunable quorum). There are a lot of great ideas in these projects, and people on the “SQL” side should really take the time to study them – there are some tricks to learn.
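To make “tunable quorum” a little more concrete, here is a minimal Python sketch of the usual rule of thumb: with N replicas, a read of R replicas is guaranteed to see the latest acknowledged write to W replicas only when R + W > N. The function names and the in-memory “replicas” are assumptions made up for illustration; they are not taken from Cassandra or any other project mentioned here.

```python
from itertools import combinations

def quorums_overlap(n, r, w):
    """True when any R replicas must share at least one node with any W replicas."""
    return r + w > n

def write(replicas, targets, key, value, ts):
    # Only the replicas in `targets` acknowledge this write.
    for i in targets:
        replicas[i][key] = (ts, value)

def read(replicas, targets, key):
    # Ask the replicas in `targets` and keep the value with the newest timestamp.
    found = [replicas[i][key] for i in targets if key in replicas[i]]
    return max(found)[1] if found else None

def always_reads_latest(n, r, w):
    """Exhaustively try every write set and read set for one new write."""
    for wset in combinations(range(n), w):
        replicas = [{"k": (1, "old")} for _ in range(n)]
        write(replicas, wset, "k", "new", ts=2)
        for rset in combinations(range(n), r):
            if read(replicas, rset, "k") != "new":
                return False
    return True
```

Lowering R or W trades that guarantee for lower latency and higher availability, which is the “tunable” part: the application picks the point on that scale per operation.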
Square Peg, Round Hole
One of the main criticisms of the “NoSQL” projects is that they are taking a step back, simply reinventing a component that already exists in a relational model. While this may have some truth, if you gloss over the high-level logical data representations, this is just wrong. Sure, it may look like a simple key-value store from the outside, but there is a lot more under the hood. For many of these projects it was a design decision to focus on the implementation details where it matters, and not bother with things like parsing SQL and optimizing joins.
I think there is still some value in supporting some form of a SQL interface because this gets you instant adoption by pretty much any developer out there. Love it or hate it, people know SQL. As for joins, scaling them with distributed relational nodes has been a research topic for years, and it’s a hard problem. People have worked around this by accepting new data models and consistency levels. It all depends on what your problem requires.
I fully embrace the “NoSQL” projects out there; there is something we can all learn from them even if we don’t put them into production. We should be thrilled we have more open source tools in our database toolbox, especially non-relational ones. We are no longer required to smash every dataset “peg” into the relational “hole.” Use the best tool for the job; this may still be a relational database. Explore your options, try to learn a few things, model your data in a number of ways, and find out what is really required. When it comes time to make a decision just remember: Dear everyone who is not Facebook: You are not Facebook.
Back in January when I was between jobs I had a free weekend to do some fun hacking. I decided to start a new open source project that had been brewing in the back of my head and since then have been poking at it on the weekends and an occasional late night. I decided to call it Scale Stack because it aims to provide a scalable network service stack. This may sound a bit generic and boring, but let me show a graph of a database proxy module I slapped together in the past couple days:
Database Proxy Graph
I set up MySQL 5.5.2-m2 and ran the sysbench read-only tests against it with 1-8192 threads. I then started up the database proxy module built on Scale Stack so sysbench would route through that, and you can see the concurrency improved quite a bit at higher thread counts. The database module doesn’t do much; it simply does connection concentration, mapping M incoming connections to N backend connections, where N is a fixed parameter given at startup.
In this case I always mapped all incoming sysbench connections down to 128 connections between Scale Stack and MySQL. It also uses a fixed number of threads and is entirely non-blocking. As you can see the max throughput around 64 threads is a bit lower, but I’ve not done much to optimize this yet (there should be some easy improvements where I simply stuck in a mutex instead of doing a lockless queue). It’s only a simple proof-of-concept module to see how well this would work, but it’s a start to a potentially useful module built on the other Scale Stack components. One other thing to mention is that these tests were run on a single 16-core Intel machine. I’d really like to test this with multiple machines at some point.
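For readers who have not seen connection concentration before, the idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the real module is C++, non-blocking, and speaks the MySQL protocol, whereas this sketch uses blocking threads, a hypothetical FakeBackend class in place of a real server connection, and a plain queue as the pool of N connections.

```python
import queue
import threading

class FakeBackend:
    """Stand-in for one backend connection (hypothetical, no real protocol)."""
    def __init__(self, ident):
        self.ident = ident
        self.queries = 0

    def execute(self, query):
        self.queries += 1
        return f"backend-{self.ident}: {query}"

class Concentrator:
    """Maps any number of client requests onto a fixed set of N backends."""
    def __init__(self, n):
        self.backends = [FakeBackend(i) for i in range(n)]
        self._idle = queue.Queue()
        for backend in self.backends:
            self._idle.put(backend)

    def execute(self, query):
        backend = self._idle.get()        # wait until a backend is free
        try:
            return backend.execute(query)
        finally:
            self._idle.put(backend)       # hand it back for the next client

def run_clients(concentrator, m):
    """Simulate M concurrent clients, each sending one query."""
    results = [None] * m

    def client(i):
        results[i] = concentrator.execute(f"SELECT {i}")

    threads = [threading.Thread(target=client, args=(i,)) for i in range(m)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling run_clients(Concentrator(128), 8192) would mirror the shape of the test above: thousands of clients, but the backend never sees more than 128 connections at once.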
So, what is Scale Stack?
Check out the website for a simple overview of what it is. The goal is to pick up where the operating system kernel leaves off with the network stack. It is written in C++ and is extremely modular, with only the module loader, option parsing, and basic logging in the kernel library. It uses Monty Taylor’s pandora-build autoconf files to provide a sane modular build system, along with some modifications I made so dependency tracking is done between modules. You can actually use it to write modules that would do anything; I’m just most interested in network service based modules.
The kernel/module loader is also just a library, so you can actually embed this into existing applications as well. Some of the modules I’ve written for it are a threaded event handling module based on libevent/pthreads and a TCP socket module. There is also an echo server and simple proxy module I created while testing the event and socket modules. The database proxy module builds on top of the event and socket module. The code is under the BSD license and is up on Launchpad, so feel free to check it out and contribute. If you need a base to build high-performance network services on, you should definitely take a look and talk with me.
What’s up next?
I have a long list of things I would like to do with this, but first up are still some basics. This includes other socket type modules like TLS/SSL, UDP, and Unix sockets. Next are some more protocol modules such as Drizzle, a real MySQL protocol module, and others like HTTP, Gearman, and memcached. It’s fairly trivial to write these since the socket modules handle all buffering and provide a simple API. As for the DatabaseProxy module, I’d like to rework how things are now so it’s not MySQL protocol specific, integrate other protocol modules, improve performance, add in multi-tenancy support for quality-of-service queuing based on account rules, and a laundry list of other features I won’t bore you with right now.
I also have plans for other services besides a database proxy, especially one that could combine a number of protocols into a generic URI server with pluggable handlers so you can do some interesting translations between modules (like Apache httpd but not http-centric). For example, think of the crazy things you can do with Twisted for Python, but now with a fast, threaded C++ kernel. I also still need to experiment with live reloading of modules, but I’m not sure if this will be worthwhile yet.
If any of this sounds interesting, get in touch, I’d love to have some help! I’ll have some blog posts later on how to get started writing modules, but for now just take a look at the existing modules. The EchoServer is a good place to start since it is pretty simple. Also, if you’ll be at the MySQL Conference and Expo next week, I’d be happy to talk more about it then.
Since I announced SlackDB a few weeks ago, I’ve had a number of questions and interesting conversations in response. I thought I would summarize the initial feedback and answer some questions to help clarify things. One of the biggest questions was “Isn’t this what Drizzle is doing?”, and the answer is no. They are both being designed for “the cloud” and speak the MySQL protocol, but they provide very different guarantees around consistency and high-availability. The simple answer is that SlackDB will provide true multi-master configurations through a deterministic and idempotent replication model (conflicts will be resolved via timestamps), where Drizzle still maintains transactions and ACID properties, which imply single master. Drizzle could add support for clustered configurations and distributed transactions (like the NDB storage engine), but writes would still happen on the majority (maintain quorum) since the concept of global state needs to be maintained.
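As a rough illustration of what timestamp-based conflict resolution buys you, here is a small Python sketch of a last-write-wins merge. It is not SlackDB code: the update format, the function names, and the use of a node id as a tie-breaker for equal timestamps are all assumptions made for the example. The point is that applying the same updates in any order, or more than once, leaves every replica with the same rows, which is what makes multi-master replication workable without coordination.

```python
def merge(version_a, version_b):
    """Pick the winner: newest timestamp, with node id as an assumed tie-breaker."""
    return max(version_a, version_b, key=lambda v: (v["ts"], v["node"]))

def apply_update(table, update):
    """Apply one replicated update to a table keyed by primary key."""
    key = update["key"]
    current = table.get(key)
    table[key] = update if current is None else merge(current, update)
    return table

def replay(updates):
    """Build a table by applying updates in the order given."""
    table = {}
    for update in updates:
        apply_update(table, update)
    return table
```

Because the merge only ever keeps the greater (timestamp, node) pair, it is deterministic, and re-delivering an update that has already been applied changes nothing.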
This led Mark Callaghan to ask why not just modify Drizzle to support these behaviors? He has a good point since most of the properties I’m talking about exist at the storage engine level. There are still a number of changes that would need to happen in the kernel around catalog, database, and table creation to support the replication model. SlackDB also won’t need a number of constructs provided by the Drizzle kernel (various locks, transaction support) so query processing can be lighter-weight. So while it’s probably possible with enough patches and plugins to make this work in Drizzle, I believe it will be easier (both socially and technically) to do this from scratch. With either approach there is still a fair amount of code to be written, and I’ve decided to use Erlang since it allows programmers to express ideas concisely and more quickly with an acceptable trade-off in runtime efficiency. This would make it even more difficult to integrate with Drizzle.
A couple folks asked why I chose the BSD license instead of GPL or Apache. I didn’t want a copyleft license, so GPL was out, but after chatting some more I decided to switch SlackDB to the Apache 2.0 license for the patent protection clause. As much as I dislike patents and would prefer not to acknowledge them, I figured having the protection clauses in there would make it less likely that anyone using the software would have to deal with them once there are other contributors who may hold patents.
I presented the techniques I’m using behind SlackDB in a session at OpenSQL Camp Boston last weekend, and overall they were well received. There was a lot of great feedback and suggestions about other projects and libraries doing related things that may help speed things along. I was glad to see I wasn’t the only person thinking about these properties for relational databases, as Josh Berkus of PostgreSQL fame also led a session on ordering events and conflict resolution within relational data when you loosen up consistency.
I also attended Surge in Baltimore and listened to a talk by Justin Sheehy about “Embracing Concurrency At Scale.” You can see another recording of the same talk here. Justin explained the concepts and problems with systems trying to maintain any kind of globally consistent state quite well, and I agree with almost everything in his presentation. This recent blog post by Coda Hale also explains some of the other key principles around what you must give up in order to get the level of availability required by most systems these days. These help explain the reasons why I started SlackDB – I’m trying to combine these properties with a relational data model. Right now I’m still only able to put my limited spare time into it, but I’m hoping to find a way to put more time into the project. Hopefully you will agree we need a database like this and will help out too. 🙂
OpenStack currently consists of three main components: Nova (Compute), Swift (Object Storage), and Glance (Image Service). There are some other projects such as a dashboard and mobile apps as well. You can see the full list here. This is a great start, but in order for OpenStack to compete long term, other infrastructure and platform services will need to be brought in. I’d like to talk about the process I’m taking with a new message queue service.
Step 1 – Idea
The first step is to figure out what is missing. What new service would complement the software already available? What hasn’t been solved yet? What are users asking for? A message queue seemed like an appropriate next step as most applications that need to scale out and be highly available will make use of a queue at some point (sometimes not in the most obvious form). It will also allow other cloud services to be built on top of it. In fact, the current OpenStack projects could even leverage a queue service for new features.
Step 2 – Initial requirements
Before you write up a proposal and send it out, it might be a good idea to gather some initial requirements and figure out what it may look like. Don’t worry about details as the community will help flesh this out later. Some of the major requirements when thinking about OpenStack projects are horizontal scalability, multi-tenancy, modular API, REST API, zones and locality awareness, and no single points of failure (high availability). This is a pretty heavy set of requirements before even getting into service specifics, but this will help you think about how to approach a service. You may have to diverge from traditional thinking for a particular service. For example, what worked in a rack or a data center may not work in the cloud. You need to account for this up front and state behavioral differences from what folks may expect. For the queue service, this meant not taking a traditional approach you see in some queue protocols and services, and instead integrating ideas from distributed services.
A multi-tenant cloud is a very different environment from what many people are used to and usually requires a different approach to solve problems. If folks tell you you’re re-inventing the wheel, take their concerns into consideration, but also realize you may not be. You may be writing a jet engine.
Step 3 – Wiki and Mailing List Proposal
Once you have a good idea and a rough outline, you’ll probably want to run it by a couple people for feedback before sending it to everyone. You’ll then want to create a new wiki page on the OpenStack wiki and send a note to the public mailing list that mentions the wiki page and asks for community feedback. For example, the queue service proposal I wrote can be found here. There is an enormous amount of collective experience and brain power on the mailing list which will help point out any issues with the proposal. The service you initially propose may look nothing like the service you actually build. It’s also quite possible the service you propose is not a good fit for the cloud or OpenStack. The community will help iron all these details out.
Step 4 – Wait
It can take folks a while to catch up on public mailing lists, so be patient. Let people know about the proposal by other means (blog, tweet, irc, …) and help facilitate the conversation as people respond.
Step 5 – Prototype
Once you feel the community is content with the proposal and it’s a viable idea (don’t expect consensus), prototype it! This shows the community you are serious and this exercise will help work out more issues in the proposal. Let the community know about it and again wait for any feedback. This doesn’t need to be anything fancy; for the queue service I put this together over a weekend.
Step 6 – Name and Language
Now comes the difficult part: choosing a project name. I’d suggest not using the mailing list for this as it will be a lot of noise for a matter that isn’t too important. Ask a couple folks who may also be interested for ideas and make sure it’s not already taken (search on GitHub, Launchpad, Google, etc). For the queue service we decided on “Burrow”.
You’ll also need to figure out the most appropriate language. For middleware and services, Python is a good default. If efficiency is a concern, look at Erlang or C/C++. Be sure to send another mail to the list and ask for feedback. With the queue service I initially proposed C++ with Erlang as an alternative since efficiency is a major concern (especially around utilizing multiple cores), and the community came back mixed but with more enthusiasm for Erlang.
Step 7 – Bootstrap the Project on Launchpad
We’re using Launchpad for OpenStack project management. You’ll need to create a project and a number of groups to manage it. For example, the queue project can be found here. The groups have the following roles (replace burrow with your project name):
burrow – Public group that anyone can join. This currently includes members on the main OpenStack mailing list, but we’re set up this way in case we need to break projects out into their own list.
burrow-drivers – The group responsible for maintaining the project, managing blueprints, and making releases.
burrow-core – The group responsible for performing code reviews.
burrow-bugs – The group responsible for managing bugs.
Step 8 – Lock onto Releases and Milestone Schedule
While not important right away, it might be a good idea to start working with the OpenStack release cycle. Releases are currently every three months with milestones set up in each release for feature freeze, bug freeze, and releases. See the release page for more details. Launchpad makes it fairly simple to manage this; you’ll just want to create a new series (for example, “cactus” right now) and a couple of milestones within that series for the freezes and release. Ask on the mailing list or on IRC if you need any help, but a good rule of thumb is to follow what other established projects do (like Nova).
Step 9 – Code!
Get to work and try to recruit other developers to help you. Keep the community updated with progress by using IRC, mailing list, planet.openstack.org, and tweets.
Step 10 – Submit to the Project Oversight Committee
Up until this point your project has not been an official OpenStack project. It is a well thought-out idea driven by the community that probably has a good chance though. You’ll need to make a proposal to the POC using this page once the project can stand on its own. You probably don’t need a final version, but you need something that is functional and more robust than a prototype. The POC meets weekly, although it may take more time (and some conversations) to decide if your project is ready. The queue service I’ve been driving has not been proposed since it’s not ready, so you may want to take all this with a grain of salt. It is my hope to have the first version ready to propose in April as part of the Cactus release.
This process will vary and can certainly be refined. I’m stating what I’ve done with a new project, but existing projects will obviously need to take a different route. The main idea to keep in mind though is that any OpenStack project should be seen as community driven, not just by an individual or company. One or more individuals may carry out a large part of the work of the community initially, but community concerns and feedback should always be treated with the utmost importance.
Walmart and Microsoft announced a five-year technology partnership to fight against Amazon’s influence.
At Microsoft’s Inspire conference in Las Vegas (July 14-18), Walmart revealed that the two companies have sealed a strategic partnership. Built on Microsoft’s cloud computing services, the agreement will give Walmart access to the Azure application platform and Microsoft 365. The companies will also work on new projects focused on artificial intelligence and on improving services using machine learning. For example, artificial intelligence solutions could help Walmart personalize its marketing campaigns for individual Internet users.
Walmart is Amazon’s biggest retail competitor, while Microsoft is Amazon’s biggest rival when it comes to cloud services.
It was therefore logical that the two companies finally joined forces to try to limit the power of their common enemy.
It is also worth remembering that the partnership comes at a time when Microsoft has expressed interest in designing a technology to rival the one used in Amazon Go physical stores. Opened to the public in January 2018, Amazon’s first store successfully replaced staff with ambitious technologies based on artificial intelligence. Customers also no longer have to wait in line because there are no cash registers: they pay with their Amazon account and mobile phone.
Microsoft, for its part, is reportedly looking at similar technology that could use cameras attached to shopping carts in stores. To date, the firm has already hired a computer vision specialist from Amazon. Moreover, Bill Gates’ company is reportedly in talks with Walmart, which would certainly have an interest in this project since it would enable it to catch up with Bezos’ company.
However, this possibility was not mentioned when the partnership was officially announced.
Samsung has unveiled its latest high-end smartphone. The Galaxy S9 relies on photography to stand out, with a variable aperture sensor and video slow motion at 960 frames per second.
The South Korean conglomerate Samsung officially presented its brand new high-end smartphone, the Galaxy S9, designed to integrate augmented reality, at the Mobile World Congress (MWC) in Barcelona. Samsung did not attend last year’s event and took advantage of a day when its competitors remained discreet to highlight its new flagship smartphone. The S9 will be equipped with a new type of optics to support the augmented reality applications developed by the Korean company.
The Galaxy S9 confirms the growing trend among smartphone manufacturers of fitting screens that cover almost the entire front of the device. Available from June 20, it can also be converted into a computer by connecting a number of accessories, and it can be unlocked by password, fingerprint, face recognition or retinal scanner. The camera focuses on photography with a variable aperture sensor and video slow motion at 960 frames per second.
Huawei still in the race
Earlier today, Chinese telecom giant Huawei introduced the latest version of its Matebook X Pro laptop and its new MediaPad M5 tablet. Both devices boast shorter charging times, longer battery life and improved technical capabilities compared to previous versions. They also incorporate simplified interconnectivity for the manufacturer’s various devices: computers, tablets and smartphones.
LG also took advantage of the eve of MWC to introduce an update of its V30S smartphone, which embeds artificial intelligence (AI) through the ThinQ software suite, the main innovation of the device. Like its local competitor Samsung, LG intends to integrate its own AI into all its devices: smartphones of course, but also connected TVs and household appliances.
However, there is no new high-end phone to take over from the LG G6, introduced during the last edition of MWC.
The two video-sharing sites are waging a merciless war for the hearts of Internet users. While the leading video platform YouTube seems to have supremacy, the French contender Dailymotion keeps resisting. To differentiate themselves, both offer original and unique services.
YouTube: The Beast
YouTube was created in 2005 by former PayPal employees. The site hosts all kinds of video (movies, music videos, TV shows and more) using Flash technology. YouTube requires registration to contribute, although any Internet user can view shared videos. Once registered on YouTube, you can post your own videos (up to 10 minutes in length), comment on and rate other users’ videos, or subscribe to a channel. The site quickly found its audience through word of mouth. One of YouTube’s big projects has been to put online all the music videos produced by Sony BMG Music Entertainment, Warner Music, and Universal. YouTube has quickly become the number one platform for artists to promote themselves. A video viewed many times ensures recognition in the media, and YouTube has enabled the discovery of many artists.
The site reigns supreme over the world of online video, especially since Google bought it in 2006. Android phones have integrated a simple and fast link to YouTube. In 2010, the video host reached two billion videos viewed daily.
Dailymotion: The Outsider
Dailymotion was created in March 2005, only one month after YouTube. Despite its English-sounding name, Dailymotion is a French company. It is very similar to YouTube, with the only difference that it hosts videos internally. Users registered on this French video-sharing website are called MotionMakers. They can send a video they have made to the site’s editorial team so that it is featured on the front page. From the beginning, Dailymotion has been supported by individual investors. In 2006, the site raised $8 million from two investors. Dailymotion signed contracts with Universal and Warner, but also with independent producers, to obtain broadcasting rights. Dailymotion is fast becoming one of the world’s most visited video-sharing platforms. It ranks 29th among the most visited sites, with 114 million visitors worldwide. Like YouTube, Dailymotion has revealed many talents.
What about other video-sharing sites?
Although YouTube and Dailymotion clearly dominate the video hosting game, a few other sites are worth a look. Vimeo is an American video hosting company created in 2004. Although Vimeo was built before YouTube, it is not as popular. The site was launched by filmmakers and other film professionals to share their work. In 2010, this video host had more than 3 million members. It offers a paid service that gives access to better quality videos and pages without advertising. The content of the site is monitored to offer only original videos. All commercial, pornographic or violent videos are automatically deleted by the site, which wishes to preserve the spirit in which it was created.