What is the best laptop for a data scientist?

I asked myself exactly the same question about a year ago and came up with a solution that now works amazingly well. I learnt a lot and saved myself a heap of money in the process.

I had already built my ideal (i.e. powerful) data analytics computer about a year earlier, but it was a desktop. I figured I could just buy a really cheap laptop, keep my desktop running all the time, and then use RDP*, TeamViewer*, or a VNC* programme to connect to it whenever I needed to do some data analysis.

I bought the cheap laptop (AU$350, 11" touchscreen display, Windows 8, an HP netbook thing) and started trying to set up VNC. I got it working, but it meant that I always had to leave my desktop running, and it was fairly laggy.

I later discovered Amazon AWS EC2, a service which lets you create virtual computers with any operating system you want and customise how you access them. I set up one of these (Linux), then taught myself how to use Linux.

The most useful thing about it is that I have installed a web-based IDE for R on it (RStudio), which allows me to go to a website hosted by my EC2 server and use R as if I were sitting at that computer. Now, whenever I want to do some work, I can do it from any computer in the world with an internet connection, simply by visiting a website, and all the processing is done on the Amazon server.

You have to pay for the server, but they are inexpensive, and you pay different amounts based on the (virtual) processor, RAM, GPU, etc. of the server. Also, there is a 1-year free trial which lets you use the least powerful virtual server at no cost.

I understand that R may not be the only language you wish to use, but given that you can install anything you want on your server, it seems like a viable option.

Advantages:

  • Can access server from any device with the internet
  • Files are always accessible. You don't even need to download them (like you would with Dropbox), just view them on the server
  • Costs much less than powerful laptop
  • Server can be programmatically designed to scale depending on analysis needs using an API
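
As a rough sketch of the last point, the function below resizes an EC2 instance through the AWS API. It assumes the boto3 library and an EBS-backed instance, whose type can only be changed while it is stopped. The function name, the instance ID and the instance type in the usage comment are placeholders for illustration, not details of the setup described here.

```python
def resize_instance(ec2, instance_id, new_type):
    """Stop an EC2 instance, change its instance type, and start it again.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # The instance type can only be changed while the instance is stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Switch to the larger (or smaller) virtual machine size.
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    # Bring the instance back up and wait until it is running.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


# Hypothetical usage (needs boto3 installed and AWS credentials configured):
#   import boto3
#   resize_instance(boto3.client("ec2"), "i-0123456789abcdef0", "m5.xlarge")
```

Passing the client in as an argument keeps the function easy to try out with a stand-in object before pointing it at a real account.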

Disadvantages:

  • Laptop screen is quite small, but I now find I access the server mostly from other desktops
  • Requires internet connection to use
  • Can take some time to learn how to use EC2

* These programmes all let you view and control one computer from a second computer over the internet.



What is the best laptop for a data scientist?


As a minimum, I would buy a machine with an i7 CPU, 16GB RAM and an SSD.

A 64-bit CPU is important (every current i3, i5 and i7 is 64-bit) because it can process large values faster than a 32-bit architecture. A number that needs 64 bits has to be split in two on a 32-bit system, which therefore needs roughly twice as many operations to handle it. 64-bit systems can also make use of far more RAM, whereas a 32-bit system can address only about 4GB.
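
To illustrate the splitting, here is a small sketch that adds two 64-bit numbers using only 32-bit-sized pieces. It is a simplified model written in Python, not a description of how any particular CPU works, and the function name is made up for the example.

```python
MASK32 = 0xFFFFFFFF  # keeps only the lowest 32 bits of a value


def add64_using_32bit_halves(a, b):
    """Add two unsigned 64-bit integers using only 32-bit-sized pieces."""
    # Split each operand into a low and a high 32-bit half.
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # Step 1: add the low halves and remember the carry out of bit 31.
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32
    lo = lo_sum & MASK32

    # Step 2: add the high halves plus the carry.
    # Anything that overflows past bit 63 is dropped, as in 64-bit hardware.
    hi = (a_hi + b_hi + carry) & MASK32

    # Glue the two halves back together.
    return (hi << 32) | lo
```

A single 64-bit addition turns into two additions plus carry handling, which is where the extra work on a 32-bit system comes from.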

i7s also have more cache, and often more cores, than i5s and i3s, which comes in handy for data processing. Keep in mind, though, that tools such as R and Python are single-threaded by default, so they will only make use of multiple cores if you use libraries that take advantage of them. R's data.table, for example, is a high-performance data frame that runs many of its operations on multiple threads, so it can perform some tasks many times faster than typical R or pandas data frames, which mostly use a single core.
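
As a minimal sketch of what "taking advantage of multiple cores" can look like in plain Python, the code below splits a list into one chunk per core and hands each chunk to a separate process using the standard library's concurrent.futures module. The function names and the sum-of-squares workload are stand-ins chosen for illustration.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(values, n_chunks):
    """Split a list into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(values), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """CPU-bound work on one chunk (runs inside one worker process)."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, workers=None):
    """Spread the work over several processes, one chunk per worker."""
    workers = workers or os.cpu_count() or 1
    chunks = split_into_chunks(values, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, chunks))
```

When running this as a script, call parallel_sum_of_squares from inside an `if __name__ == "__main__":` block, since some platforms start worker processes by re-importing the main module. For small inputs the cost of starting the processes outweighs the gain, so this only pays off for heavy work.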

16GB RAM is important because datasets are processed in memory. As soon as your dataset exceeds the size of RAM, your system will either slow down dramatically as it swaps data to and from disk, or stop working altogether.
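
A quick back-of-the-envelope check can tell you whether a dataset is likely to fit. The sketch below assumes a purely numeric data frame stored at 8 bytes per value, and a 50% headroom to leave space for copies made during processing and for the operating system. Both figures are assumptions for illustration, and text columns will usually take more.

```python
def estimate_frame_gib(n_rows, n_numeric_cols, bytes_per_value=8):
    """Rough in-memory size of a purely numeric data frame, in GiB."""
    return n_rows * n_numeric_cols * bytes_per_value / 2**30


def fits_in_ram(n_rows, n_numeric_cols, ram_gib, headroom=0.5):
    """True if the frame uses no more than `headroom` of the given RAM.

    The 50% default is an assumed safety margin, not a measured value.
    """
    return estimate_frame_gib(n_rows, n_numeric_cols) <= ram_gib * headroom
```

For example, 10 million rows with a hypothetical 20 numeric columns comes to roughly 1.5 GiB, which sits comfortably in 16GB, while 1 billion rows of the same shape would need around 150 GiB.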

An SSD is important because reading and writing large datasets to disk can be around 40x faster with an SSD than with a traditional hard drive.

With this type of setup, I can easily handle a 10 million record data frame in R, which fits my type of work just fine.

This is also a stock standard high-end laptop which is still nice and light to carry.

I personally use a Lenovo X1 Carbon.

You could really purchase any computer with similar specs. However, I do think that certain brands are more reliable than others (because of better driver support and updates) and have less bloatware.

You could get more powerful laptops but they start to get a lot chunkier and bespoke as there’s only so much room for components (and cooling).

I generally don’t use deep learning algorithms that take advantage of GPU acceleration so that’s not something I would spend the extra money on.

Dedicated GPUs are not that common on laptops unless you buy an expensive, chunky gaming laptop.

The 14″ screen on the X1 Carbon is really nice for the size of the laptop. In general, though, you'll want to plug in one or several external monitors.

I connect my laptop to a 4K TV as an external monitor, which is like having 4x 1080p displays. Great for seeing a lot of code and data at the same time. Not so great for posture.

What about Macs?


Apple makes some really nice (albeit expensive) machines. The nice thing about macOS (formerly OS X) is that it's built on top of Unix. The downside is that it doesn't run the full version of Excel or a lot of other business applications such as various finance and trading platforms.

The full version of Excel is very useful because features such as Power Query and Power Pivot let you share large datasets in Excel through a reproducible process.

Windows now has good support for Linux and allows you to easily install various Linux distributions as a subsystem under Windows (the Windows Subsystem for Linux).

You can download and install an Ubuntu Linux distribution from the Microsoft Store for free in just a few minutes.

You could, of course, run Windows on a Mac, but that sort of defeats the point and ends up very expensive.

If I were working on larger datasets, I would work either on a desktop or in the cloud.

The benefit of the cloud is that it’s much easier to scale up and down if you “infrequently” run big jobs because you’ll only pay for what you use.

If you’re getting into big data which requires the establishment of a cluster this will be far easier in the cloud as you won’t need to buy and configure several PCs.

Use your laptop as a thin client and connect remotely to get access to practically unlimited resources.

I've set up RStudio on AWS, which can then be accessed remotely through any web browser.

Jupyter notebooks are also a good option for Python users and for getting access to cloud machine learning libraries.

The downside of the cloud is that you can’t work offline.

Higher spec machines do start to cost a bit of money to rent and even though you can spin them up and down when you want to use them, that does become a bit of an unnecessary hassle when you’re working with less than 10 million records which can easily be crunched on a laptop.

A lot of the time you could easily keep a sample of your data under 10 million records and work on that, because frankly, it's nicer working on your own machine if possible, and then just use the cloud to scale.
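
One way to pull such a sample without loading the full dataset into memory is reservoir sampling, which reads the records once and keeps a fixed-size random subset. This is a generic technique rather than anything specific to the workflow described here, and the function name below is made up for the example.

```python
import random


def reservoir_sample(records, k, seed=None):
    """Return k records chosen uniformly at random from an iterable.

    The iterable is read once and only k records are held in memory,
    so its total length does not need to be known in advance.
    """
    rng = random.Random(seed)
    sample = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)  # both ends inclusive
            if j < k:
                sample[j] = record
    return sample
```

Any iterable works as input, for example the lines of an open file, so a multi-gigabyte CSV can be cut down to a laptop-sized sample in a single pass.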

Smaller datasets can still make sense in the cloud if you’re working in a team and you don’t want to pull down data to laptops either because of bandwidth, security or synchronization issues.

If you're working with between 10 million and 1 billion records on a "consistent" basis, pulling data is not an issue, and your cloud hosting costs are starting to rack up, a desktop may be a better option for you.
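
To get a feel for when renting stops being the cheaper choice, you can work out a simple break-even point. The sketch below is deliberately crude: it ignores electricity, storage fees and resale value, and the prices used in the example afterwards are made-up placeholders rather than real quotes.

```python
def cloud_cost(hours_per_month, hourly_rate, months):
    """Total rental cost for a machine billed only while it is running."""
    return hours_per_month * hourly_rate * months


def break_even_hours_per_month(purchase_price, hourly_rate, months):
    """Monthly usage above which buying is cheaper than renting.

    `months` is how long you expect to keep the purchased machine.
    """
    return purchase_price / (hourly_rate * months)
```

With a hypothetical 3,000 desktop kept for 36 months and a comparable cloud machine at 0.50 per hour, the break-even is about 167 hours a month. Below that, renting is cheaper; above it, the desktop wins.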

Hope that helps

Data scientists are highly innovative professionals with high-end data processing requirements. Unless the data scientist always uses a remote server to get the job done, a powerful laptop will continue to be a requirement.

If you are a seasoned professional who mainly cares about storage, RAM, and processing speed, you can go for the ROG Strix Scar III or the Blade Pro 17 from Razer. These machines also come in handy if you want to use GPU acceleration for neural networks.

If you are strictly looking for a MacBook, nothing beats the new MacBook Pro 16. However, if you are a student or an aspiring professional with less demanding requirements, the Lenovo Yoga 730 is a decent enough device.

In the end, it all comes down to the criticality of your job and the nature of tools that you would be using for data visualization, data scripting, database management, and establishing connections with the data server.
