Just today I was talking to my adviser and mentioned that I have 10 GB of space on my AFS account. His response was, "I don't know why they give you so little space; I mean, who uses gigabytes any more?" The sentiment rings particularly true considering that this morning I ran a test code (just to see if I had made any errors), and in the 3 1/2 minutes it ran it produced 3.2 GB of data (in a total of 8 output files: six of 320 MB and two of 640 MB). At that rate I would be out of space in less than 11 minutes (it would actually take me about 20 minutes, but that's just a technicality). Of this data I kept only one file and trashed the rest. For a serious simulation I am looking at 1-2 TB of data, so my threshold for what counts as "a lot of data" is quite high.
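To put numbers on that, here is the back-of-envelope calculation, sketched in Python using just the figures above:

    # Time to fill my 10 GB AFS quota at the test run's output rate.
    quota_gb = 10.0       # AFS quota
    run_gb = 3.2          # data produced by the test run
    run_minutes = 3.5     # how long that run took

    rate_gb_per_min = run_gb / run_minutes   # about 0.91 GB per minute
    print(quota_gb / rate_gb_per_min)        # about 10.9, i.e. under 11 minutes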
So, to come back to my original topic: my data ranges from around 100 MB all the way up to several TB, with the MB range being the very low end. My questions are: What is your data range? What size of files do you typically work with? When do you start to say, "Wow, that's a lot of memory"? I just want to get a sense of the file sizes, and the amount of storage, people consider to be "sufficient".
For me, I start to say "Wow, that's a lot of memory" somewhere around 5 TB. A normal file size for me is several MB, if not more, and a typical amount of data that I use and move around is usually several GB. How about you?
My data ranges from a few kB to a couple hundred GB, depending on the project. Typically I deal with 100 MB - 1 GB, just because of RAM limitations in manipulating the data. The particular data I am referring to cannot be used on a supercomputer, and I don't have easy access to a workstation with >100 GB of RAM, so I usually reduce the complexity of my data.
I use data sets ranging from several MB to several GB, and I often need to use many at the same time, so I can't get my work done without a machine with a minimum of 32 GB of RAM.
Part of my job is to do maintenance on data in three large data centers. These data centers receive a few kB of data every time someone clicks on one of our customer web sites, and we have a lot of customers. We handle about 1 trillion hits per quarter, and we keep two years or more of customer data. In total there are several petabytes of data spread across about 20,000 servers in the three data centers. File sizes are on the gigabyte scale, and I do maintenance on data sets over a terabyte in size at a time. Some jobs I launch may take a week or more to finish.
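As a rough sanity check on those figures (the per-hit size here is an assumption; "a few kB" is all I can say):

    # Back-of-envelope total for the data-center numbers above.
    hits_per_quarter = 1e12   # about 1 trillion hits per quarter
    bytes_per_hit = 1e3       # assume roughly 1 kB per hit ("a few kB")
    quarters_kept = 8         # two years of retention

    total_bytes = hits_per_quarter * bytes_per_hit * quarters_kept
    print(total_bytes / 1e15)  # about 8 PB, i.e. "several petabytes"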
Even my data is in the terabyte range, and I am dealing in names and images, not the universe.
I admit, most of my experiments don't require a ton of data-taking. The most data-intensive measurements I make are things like characterizing a gain-medium crystal or trying to predict and measure the output profile of a new laser. My computer has a total of about 100 GB of storage space, of which I've used only about 50 GB. I can currently back up all my pertinent data with the 2 GB free from Mozy.
When I worked at LLNL, the NIF generated around 10 TB of diagnostic data with every shot. That info was transferred to the BlueGene supercomputer in the next building over for processing.
During my MS I worked a lot with computer vision, so I had many large videos. Processing that data would often produce several GB of output for review, often in the form of other movies and images.
Now, however, as a control engineer, my data sets are pretty dang small (on the kB level), so most of the space on my current HD is filled up by my iTunes library :-) .
Right now I've got dark-matter-only simulations using about 800 GB, and 3 simulations with baryons using about 500 GB each.
For my postdoc (assuming I get one...), I plan on creating about 10 TB of data, maybe a bit more. What's an MB again? Oh yeah, that tiny little speck of a tiny portion of my data.