Docker Hugo

August 3, 2018 | Theo

After restarting my blog, I wanted a way to automate my workflow. I currently work for AWS, and want to use the features of the cloud to manage and deploy my blog, but for as little cost as possible. The lowest cost for a static site like mine is Amazon S3, which offers to host the objects in the bucket as a static website.

This starts by adopting a solid framework for building static sites. After trying a few, I selected Hugo. I had been using mkdocs for training/tutorials but felt it lacked a good native layout engine and wasn’t a good fit for a blog.

I followed the installation instructions, but wanted something I could containerize (since it’s relevant to my current work). Thus, I created docker-hugo as a simple project to containerize hugo.

For now, this includes a README and a Dockerfile (copied as of August 3, 2018):

FROM centos:latest as builder

RUN yum -y update
RUN curl -sL -o hugo.tar.gz https://github.com/gohugoio/hugo/releases/download/v0.46/hugo_0.46_Linux-64bit.tar.gz && tar zxf hugo.tar.gz hugo

FROM scratch

COPY --from=builder /hugo .

VOLUME /host
WORKDIR /host
EXPOSE 1313

ENTRYPOINT ["/hugo"]

While a simple example, it does combine some newer Docker features. I used a multi-stage build to download the actual binary, then a scratch image for the actual deployment. The README highlights the syntax I use for the command, and an alias for being able to run hugo new posts/docker-hugo.md with all of my environment variables already plugged in. This can also be adapted for a future CI/CD process.

Add Athena Partition for ELB Access Logs

July 31, 2018 | Theo

If you’ve worked on a load balancer, then at some point you’ve witnessed the load balancer taking the blame for an application problem (it’s like a rite of passage). Clearing the load balancer used to be difficult, but with AWS Elastic Load Balancing you can capture Access Logs (Classic and Application only) and very quickly identify whether the load balancer contributed to the problem.

Much like any log analysis, the volume of logs and frequency of access are key to identifying the best log analysis solution. If you have a large store of logs but access them infrequently, then a low-cost option is Amazon Athena. Athena enables you to run SQL-based queries against your data in S3 without an ETL process. The data is durable and you only pay for the volume of data scanned per query. AWS also includes documentation and templates for querying Classic Load Balancer logs and Application Load Balancer logs.

This is a great model, but with a potential flaw–as the data set grows in size, the queries become slower and more expensive. To remediate, Amazon Athena allows you to partition your data. This restricts the amount of data scanned, thus lowering costs and increasing speed of the query.

ELB Access Logs store the logs in S3 using the following format:

s3://bucket[/prefix]/AWSLogs/{{AccountId}}/elasticloadbalancing/{{region}}/{{yyyy}}/{{mm}}/{{dd}}/{{AccountId}}_elasticloadbalancing_{{region}}_{{load-balancer-name}}_{{end-time}}_{{ip-address}}_{{random-string}}.log

Since the prefix does not pre-define partitions, the partitions must be created manually. Instead of creating partitions ad-hoc, create a CloudWatch Scheduled Event that runs daily targeted at a Lambda function that adds the partition. To simplify the process, I created buzzsurfr/athena-add-partition.

This project is both the Lambda function code and a CloudFormation template to deploy the Lambda function and the CloudWatch Scheduled Event. Logs are sent from the Load Balancer into an S3 bucket. Daily, the CloudWatch Scheduled Event will invoke the Lambda function to add a partition to the Athena table.
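To make the daily step concrete, here is a minimal sketch of what such a Lambda function could look like. It is not the code from buzzsurfr/athena-add-partition; the environment variable names (ATHENA_DATABASE, ATHENA_TABLE, LOG_LOCATION, QUERY_OUTPUT) and the function names are assumptions for illustration, and the partition keys year, month, and day are assumed to match the table definition.

```python
import datetime
import os


def build_add_partition(table, location_prefix, day):
    """Return an ALTER TABLE statement that registers one day's S3 prefix as a partition."""
    y, m, d = day.strftime("%Y"), day.strftime("%m"), day.strftime("%d")
    location = f"{location_prefix.rstrip('/')}/{y}/{m}/{d}/"
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (year='{y}', month='{m}', day='{d}') "
        f"LOCATION '{location}'"
    )


def handler(event, context):
    """Entry point for the scheduled event: add today's partition (UTC)."""
    import boto3  # imported here so the query builder can be tested without AWS

    today = datetime.datetime.now(datetime.timezone.utc).date()
    query = build_add_partition(
        os.environ["ATHENA_TABLE"],   # hypothetical variable, e.g. elb_logs
        os.environ["LOG_LOCATION"],   # hypothetical, e.g. s3://bucket/prefix/AWSLogs/<account>/elasticloadbalancing/<region>/
        today,
    )
    athena = boto3.client("athena")
    return athena.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": os.environ["ATHENA_DATABASE"]},
        ResultConfiguration={"OutputLocation": os.environ["QUERY_OUTPUT"]},
    )
```

Using ADD IF NOT EXISTS keeps the function safe to run more than once for the same day.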

Using the partitions requires modifying the SQL query used in the Athena console. Consider the basic query to return all records: SELECT * FROM logs.elb_logs. Add/append to a WHERE clause including the partition keys with values. For example, to query only the records for July 31, 2018, run:

SELECT *
FROM logs.elb_logs
WHERE
  (
    year = '2018' AND
    month = '07' AND
    day = '31'
  )

This query with partitions enabled restricts Athena to only scanning

s3://bucket/prefix/AWSLogs/{{AccountId}}/elasticloadbalancing/{{region}}/2018/07/31/

instead of

s3://bucket/prefix/AWSLogs/{{AccountId}}/elasticloadbalancing/{{region}}/

resulting in a significant reduction in cost and processing time.

Using partitions also makes it easier to enable other Storage Classes like Infrequent Access, where you pay less to store but pay more to access. Without partitions, every query would scan the bucket/prefix and potentially cost more due to the access cost for objects with Infrequent Access storage class.

This model can be applied to other logs stored in S3 that do not have pre-defined partitions, such as CloudTrail logs, CloudFront logs, or for other applications that export logs to S3, but don’t allow modifications to the organizational structure.

Blog Restart

July 27, 2018 | Theo

It’s been over 10 years since I had a blog, or at least maintained one. I want to promote my personal brand but have often not put forth the effort. I have a significant amount of experience, so it’s just a matter of putting my experiences down “on paper”…and having the right tool to publish.

Enter Hugo. I’ve been a fan of Markdown for a while, and make avid use of it for projects on GitHub or written for mkdocs. I wanted something that could deploy to a static site since my actual code rarely changes, and to save overall costs. My current site is built on GitHub Pages, which does not give me the capabilities I need, and I wanted something similar to mkdocs where I could easily deploy a scaffold and get to work.

While I often travel with a laptop, I’ve also been looking at my mobile productivity, and I feel that I could accomplish more by using my mobile device. When I have an idea, I want to commit quickly. My tablet is an easy way to do so since it takes up less room and has less time to boot, but has lacked a sufficient productivity tool.

I’ve typed up this post partially using Working Copy. For me, it has the right blend of git integration and file editor (with Markdown syntax highlighting). You can’t push without the in-app purchase, but the free version plus a 10-day trial lets you test before buying, which let me make sure it fits my workflow.

For my blog content, I plan to document my experiences through my IT journey in hopes that it will also help others. I’ve always embraced the IT community, and a blog is my latest way of giving back. I’ve always struggled with trying to get the best structure and methods before pushing something new, and that’s always led to me never launching. This time, I’m accepting that the blog may not be perfect, but it’s out there and functional. I’ll be able to make improvements over time and grow this into a resource for all.

Moved to theodorejsalvo.com

June 2, 2010 | Theo

Sorry for the confusion, but shortly after opening this blog I decided it was time to get my own domain name/hosting, and put WordPress.org on that.

This blog is now outdated.

Please visit www.theodorejsalvo.com

Thanks!

Update: Internet Safety and Privacy Concerns

April 14, 2010 | Theo

As a follow-up to my post Internet Safety and Privacy Concerns…are you protected?, I found a post on Google’s Security blog about fake antivirus programs, or FakeAV, which discusses how real a threat this is and how we seem to be getting better at handling it.

The post can be found at http://googleonlinesecurity.blogspot.com/2010/04/rise-of-fake-anti-virus.html.

Review: Dell Mini 10v

April 9, 2010 | Theo

I needed a computer I could take with me. My last laptop was an HP tx1000 Convertible that I’ve given to my father. That laptop was used for some personal, but mostly business. My next laptop was going to be for personal only. I had seen the cheesy Dell “candy” commercials (got the song stuck in my head and everything!) and liked the idea of a netbook. Sacrifice an optical drive for a lightweight ultraportable? Yes, please.

My primary functions on a computer are internet, email, programming, and IT scripting–all of which use very little computing power. I’m a command/keyboard-oriented user–I’d prefer to type a command in vs. a GUI and I don’t need a mouse if I don’t want one.

Configuration

Component	Original Configuration	My Configuration
Processor	Intel Atom N270, 1.6 GHz, 533 MHz FSB, 512 KB L2 Cache	(same)
Display	10.1″ Widescreen Display (1024×600 Native)	(same)
Memory	1 GB DDR2 SDRAM	(same)
Graphics Chipset	Intel GMA 950	(same)
Hard Drive	160 GB, 2.5″, 5400 RPM SATA	16 GB SSD
Battery	24 Wh (3-cell) Lithium-Ion	(same)
Wireless Card	802.11g (1397)	802.11n (1510)
Webcam	Integrated 1.3 MP	(same)
Bluetooth	None	Integrated 2.1 w/ EDR
Operating System	Windows XP SP3	Ubuntu 8.04 LTS

Pre-Sales
I researched this netbook for two weeks comparing brands. Dell was always the top choice because of past experiences I have had with the company. The larger debate was between the 10 and 10v, and I settled on the 10v because of issues with Linux and the Graphics Chipset that comes with the 10.

I like the ability to connect to whatever I need to, thus 802.11n wireless, Bluetooth, and the webcam were all required.

The hard drive was also a debate. At the time there were very few reputable SSDs out there, so it was hard to tell whether the performance was worth losing the space. Ultimately, it was a “try the new latest-and-greatest” decision.

I’ve always used Windows because of my reliance on proprietary software. For the purposes of this computer, I do not need any proprietary software, and my linux/unix skills are rusty. Thus, I opted for Ubuntu Linux as the operating system.

The Good
This netbook is LIGHT! It weighs less than my desktop keyboard. I don’t feel the weight of it when I’m reclined in my office chair. I can pick it up and move it around with one hand WITH EASE!

It is quiet. The form factor + the SSD means no fans, no moving parts. Makes a big difference when you’re trying to think.

It’s fast. Based on the specs of the processor, and having only 1 GB of RAM I was expecting it to act a little sluggish. The operating system helps, since with this computer I moved from a Windows computer. If I were going to run Windows on this machine I’d opt for the 2 GB, especially now with Windows 7. I constantly run Firefox, Empathy, Evolution, and a Terminal. With these open I’ve opened one or two more programs up, but real estate on the screen is precious, and I typically don’t need anything else open at once. I’d bet it could do more though.

When I need/want to, I can use the heavier bats. Terminal Services/Remote Desktop work fine (as long as my wireless router isn’t acting up, and being wireless in the first place doesn’t help). I also installed VirtualBox and had an Ubuntu VM running with very little performance loss. Packet sniffers and network crawlers have no problem. OpenOffice runs just as fast as on my desktop. I was able to take a Sprint USB AirCard, click Connect, and be connected WITHOUT Sprint’s software.

The smaller battery lasts 2½ hours on constant normal use. The AC adapter packs up nicely, and is easy to tote around, should I actually use it that long.

The Bad
No Ubuntu Key: Ubuntu was factory default from Dell, but the Windows key is still on there. @Dell: when you make the super key with the Ubuntu logo, or even Tux, make sure and send me one. For now, thank you System76! Get your own Ubuntu stickers!

Mousepad is sensitive: because of the design, click-dragging is nearly impossible. I know this is a software issue because the Minis I observed with Windows ignored the bottom part of the pad where the buttons reside. It takes some getting used to. However, I still prefer this layout to having the mouse buttons on the left & right sides of the pad.

Hard to Modify: Aside from the battery removal, the only compartment on the underside is for the mini-PCI slot, which has my WiFi card in it. Getting to everything else, including the RAM, is the equivalent of performing thoracic surgery.

Preloaded Dell software: Pet peeve of mine, but I’d like to choose my own favorite search engine, instead of having 4 links to a certain unmentioned one. I also couldn’t upgrade to the latest stable build of Ubuntu without uninstalling Dell’s copy of Ubuntu and using Canonical’s package. I ended up with Ubuntu Netbook Remix 9.10, and I’ll be upgrading when stable 10.04 is released.

Screen could be bigger: I am glad the keyboard got bigger from the Mini 9 to the Mini 10, but I feel that with today’s technology, we don’t need a 1″ plastic border around the screen.

Can’t secure RGB cable in place…no screw holes on the side of the port.

AC Adapter required the bottom plug on a single-gang outlet box. Plays well on other sides though.

Thoughts
When Moblin gets further along, this will make a great machine to test it on. Same with Chrome OS.

The WiFi card is a half-mini PCI, and Dell moved the connector over to accommodate. Was that necessary?

If Dell is going to bundle Ubuntu with hardware, then I’d like to see some more involvement from them in the community. I’m not saying you’re not there…I just don’t see you.

Conclusion
This is a great buy for the money I paid. It’s not top of the line, but it will do what I need it to and then some. I don’t know if I would combine this configuration with Windows, but for Ubuntu I’d highly recommend it.

Tutorial: IP Addresses and Subnetting

January 29, 2010 | Theo

When you connect your computer to the internet, your computer is given an Internet Protocol (IP) address. If you check your connection settings, you will commonly see two similar values; one is called a subnet and the other a gateway. Have you ever wondered what these three actually mean and what they do?

An IP address is like your address at your home. Your home address tells other people where to find you when you’re home, and your IP address tells other computers where to find your computer. The most common form of an IP address is called dotted decimal notation, which has four sets of numbers, each ranging from 0 to 255, separated by a period. 192.168.0.1 is a common example, and I’ll use it throughout this tutorial, so write it down!

Why 0 to 255? It is the range of an octet, or 8-bit number. As you’ve heard before, computers speak in binary, which is just 0’s and 1’s, on and off, etc. Each bit can have a 0 or a 1, so the range 0 to 255 is really 00000000 to 11111111 in binary. There are four octets in an IP address, which is a fancy way of saying an IP address is a 32-bit number.

Which of these numbers is easiest to understand?

192.168.0.1 – Dotted Decimal Notation using octets
11000000101010000000000000000001 – Binary
11000000.10101000.00000000.00000001 – Dotted Binary
3232235521 – Decimal Notation

They are all the same! But the first one, dotted decimal notation, has proven itself to be the easiest to use, especially when you have subnetworks.
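If you want to check that for yourself, here is a small Python sketch that converts the dotted decimal form into the other three. The function names are my own, chosen for illustration.

```python
def to_int(dotted):
    """Pack the four octets of a dotted-decimal address into one 32-bit integer."""
    value = 0
    for octet in dotted.split("."):
        value = (value << 8) | int(octet)  # shift left one octet, then add the next
    return value


def to_binary(dotted, sep=""):
    """Render each octet as 8 binary digits, joined by sep."""
    return sep.join(format(int(octet), "08b") for octet in dotted.split("."))


ip = "192.168.0.1"
print(to_binary(ip))       # 11000000101010000000000000000001
print(to_binary(ip, "."))  # 11000000.10101000.00000000.00000001
print(to_int(ip))          # 3232235521
```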

Subnetworks? You mean there’s more than one??? The reason relates to divide-and-conquer. One person trying to manage 4294967296 computers would be taxing, to say the least. Hence we can use subnetworks, or subnets, so that one computer is not responsible for the whole network. With the hardware and software out today, it’s easy for a device to manage its own subnet. But which one are you using? How big of a subnet are you in?

The answer is right there in your subnet, sometimes called a subnet mask. This tells us what subnet we are a part of, and how big the subnet is. To make things easier, there are three defined subnets:

Class A – 255.0.0.0
Class B – 255.255.0.0
Class C – 255.255.255.0

For our example, let’s say that our IP address, 192.168.0.1, is in a Class C subnet, 255.255.255.0. This means that 192.168.0.1 is in a subnet with 256 addresses ranging from 192.168.0.0 to 192.168.0.255. (“How I did that” is coming up…)

An IP address is actually a combination of two IDs, a network ID and a host ID. Notice in the range above, the 192.168.0 stayed constant while the last octet changed. Simply put, the constant 192.168.0 is the network ID and the last octet is the host ID. This is important so your computer knows what subnet you are on.

By now you’ve noticed that our subnet mask, 255.255.255.0 has no digits in common with our IP or our range of IPs–so why is it important? Our subnet mask is the dividing line between network and host! Let’s look at our IP and subnet mask:

IP:     192.168.  0.  1
Subnet: 255.255.255.  0

That vaguely shows me that the first three octets are the network, but you’ve seen subnets that don’t have 255 or 0 in all of the slots.

What does binary show us?

IP:     11000000.10101000.00000000.00000001
Subnet: 11111111.11111111.11111111.00000000

There it is! Every bit in the subnet that’s a 1 makes that bit in the IP part of the network. Every bit in the subnet that’s a 0 makes that bit in the IP part of the host.
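In code, that rule is a bitwise AND. The sketch below (function name is mine) keeps the bits where the mask is 1 to get the network ID, and the bits where the mask is 0 to get the host ID.

```python
import ipaddress


def split_ids(ip, mask):
    """Return (network ID as dotted decimal, host ID as an integer)."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    network_id = ip_int & mask_int             # bits where the mask is 1
    host_id = ip_int & ~mask_int & 0xFFFFFFFF  # bits where the mask is 0
    return str(ipaddress.IPv4Address(network_id)), host_id


print(split_ids("192.168.0.1", "255.255.255.0"))  # ('192.168.0.0', 1)
```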

Let’s take a look at a different subnet mask: 255.255.255.128. This isn’t one of our three “class” subnets, and thus it is called a classless subnet. Again, in binary:

IP:     11000000.10101000.00000000.00000001
Subnet: 11111111.11111111.11111111.10000000

Not a big change, but what does that do to our network? First, you now only have 2^7, or 128, addresses in your subnet instead of 256. And now your network ID has changed…

With our new subnet, is 192.168.0.127 in our subnet?

Subnet: 11111111.11111111.11111111.10000000
IP:     11000000.10101000.00000000.00000001 <-- 192.168.0.1
IP:     11000000.10101000.00000000.01111111 <-- 192.168.0.127
        ←          Network         →← Host→

Both IPs have the same Network ID, thus they are in the same subnet.

What about 192.168.0.128?

Subnet: 11111111.11111111.11111111.10000000
IP:     11000000.10101000.00000000.00000001 <-- 192.168.0.1
IP:     11000000.10101000.00000000.10000000 <-- 192.168.0.128
        ←          Network         →← Host→

These have different Network IDs, so they are in different subnets.

What if we go back to a Class C subnet?

Subnet: 11111111.11111111.11111111.00000000
IP:     11000000.10101000.00000000.00000001 <-- 192.168.0.1
IP:     11000000.10101000.00000000.10000000 <-- 192.168.0.128
        ←          Network        →← Host →

Now they are in the same subnet!
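The three comparisons above boil down to one test: mask both addresses and see whether the network IDs match. A minimal sketch, with function names of my own choosing:

```python
import ipaddress


def network_id(ip, mask):
    """Keep only the network bits of an address."""
    return int(ipaddress.IPv4Address(ip)) & int(ipaddress.IPv4Address(mask))


def same_subnet(ip_a, ip_b, mask):
    """Two addresses share a subnet when their network IDs are equal."""
    return network_id(ip_a, mask) == network_id(ip_b, mask)


print(same_subnet("192.168.0.1", "192.168.0.127", "255.255.255.128"))  # True
print(same_subnet("192.168.0.1", "192.168.0.128", "255.255.255.128"))  # False
print(same_subnet("192.168.0.1", "192.168.0.128", "255.255.255.0"))    # True
```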

Before we noted that dotted decimal notation is preferred for writing IP addresses because it’s the easiest to remember. If you have a Class C subnet and are responsible for assigning IP addresses to computers, you know that [in our example] the first three octets will always be 192.168.0! Much easier!

There’s also a shorthand for writing subnets. Just put a slash (“/”) after the IP address, and then write how many 1’s are in the binary form of your subnet mask. For example, with a Class C subnet our shorthand would read 192.168.0.1/24, and our classless example would be 192.168.0.1/25.
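Counting the 1’s is easy to automate. This sketch (function names are mine) derives the number after the slash from a mask, and the number of addresses a subnet of that size contains.

```python
import ipaddress


def prefix_length(mask):
    """Count the 1 bits in a subnet mask, e.g. 255.255.255.0 -> 24."""
    return bin(int(ipaddress.IPv4Address(mask))).count("1")


def address_count(prefix):
    """Every host bit doubles the number of addresses in the subnet."""
    return 2 ** (32 - prefix)


for mask in ("255.255.255.0", "255.255.255.128"):
    prefix = prefix_length(mask)
    print(f"{mask} -> /{prefix}, {address_count(prefix)} addresses")
# 255.255.255.0 -> /24, 256 addresses
# 255.255.255.128 -> /25, 128 addresses
```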

In your network settings, there’s also a gateway. How does it relate? The gateway is an IP address INSIDE your subnet which connects you to the OUTSIDE of your subnet. If you try to send a packet to a computer and its IP address isn’t in your subnet, then your computer sends it to your gateway and the gateway passes the message on.
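Here is a rough sketch of that decision in Python. The gateway 192.168.0.254 and the destination addresses are made up for the example, and a real operating system consults a full routing table rather than a single rule like this.

```python
import ipaddress


def next_hop(my_ip, mask, gateway, destination):
    """Deliver directly inside the subnet; otherwise hand the packet to the gateway."""
    subnet = ipaddress.ip_network(f"{my_ip}/{mask}", strict=False)
    if ipaddress.ip_address(destination) in subnet:
        return destination
    return gateway


# Hypothetical gateway and destinations, for illustration only
print(next_hop("192.168.0.1", "255.255.255.0", "192.168.0.254", "192.168.0.50"))  # 192.168.0.50
print(next_hop("192.168.0.1", "255.255.255.0", "192.168.0.254", "10.1.2.3"))      # 192.168.0.254
```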

I know how you feel. When I learned about subnetting, the only thing I could think of is, “how will this be useful?” I then went into work and looked at my company’s IP address, subnet, and gateway (I changed the company IP to protect my company, sorry):

IP:     192.168. 25.174
Subnet: 255.255.255.252
Gateway:192.168. 25.173

Back to binary:

IP:     11000000.10101000.00011001.10101110
Subnet: 11111111.11111111.11111111.11111100
Gateway:11000000.10101000.00011001.10101101

Let’s look at the subnet first. There are only two bits for host IDs (and 00 and 11 are reserved), which leaves exactly TWO usable IP addresses in this subnet. This means that my computer is on a network BY ITSELF, and the other address is used as the gateway! This does not work for every case but when I have just one computer I don’t want anyone else inside my subnet!

And my IP and subnet are easy to remember because I write it as 192.168.25.174/30!
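Python’s standard ipaddress module can confirm the math for this /30. This is just a quick check, using the same (changed) company address from above.

```python
import ipaddress

iface = ipaddress.ip_interface("192.168.25.174/30")
subnet = iface.network

print(subnet)                    # 192.168.25.172/30
print(subnet.network_address)    # 192.168.25.172 (host bits 00, reserved)
print(subnet.broadcast_address)  # 192.168.25.175 (host bits 11, reserved)

# hosts() skips the two reserved addresses, leaving my computer and the gateway
usable = [str(host) for host in subnet.hosts()]
print(usable)                    # ['192.168.25.173', '192.168.25.174']
```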

Internet Safety and Privacy Concerns…are you protected?

January 21, 2010 | Theo

Over my tenure as a “computer guy” I’ve been asked to “fix” a computer countless times. I remember it started out as “my computer won’t print” or “the screen went blank”, but lately I’ve heard a different phrase… “the internet is down!” While this may be true, it is also proving to be safer. Recently I’ve helped family, friends, clients, etc. to get rid of Antivirus spoofs, or fake programs that look legitimate, but in reality are a virus, a phishing scam, or worse!

It starts with an unknown website or email attachment (I even saw one come in from an internet ad!) which puts the virus on your computer. From there, it quietly works its way into the computer disabling the “easy way” for me to fix it, then finally pops up a window saying “You are infected!” The popup looks like a windows program and most of the time doesn’t have a brand name so it just says “Antivirus 2009” etc. If I’m lucky, I get a call when they see this screen. Most of the time, I’m not that lucky…

Should you let it scan your computer, it starts listing important files as infected and says that you can pay some amount to buy the program and it will clean the viruses for you. At which point they ask you for a credit card number, and then you are toast! It was all a ploy to steal your credit card & identity to do who knows what with them!

Granted, there are fixes, patches, and repair tools for these problems. However, these attacks are better prevented than treated, and all it takes is an antivirus and/or antispyware program; some are even FREE.

So why hasn’t every computer user without antivirus downloaded the free antivirus tool? Because the average user doesn’t think that they will get a virus, which is the same as the average biker not wearing a helmet because they don’t think they’ll crash.

We are not as safe on the internet as we think we are!

Case in Point: AT&T had a glitch the other day where a customer was on their phone going to Facebook to login, and they were automatically logged in…TO SOMEONE ELSE’S ACCOUNT! Apparently, the servers were confused on who was who, and it sent the wrong information to the wrong phone. I don’t think anyone is to blame, but this just proves that what you think is safe isn’t really safe. (Read the AP story here.)

What you think is private isn’t really private. You can put the pictures of you partying on MySpace, but just remember that a court order will put those pictures in the hands of the prosecutor. If you don’t believe me, read here.

Now an antivirus/antispyware program won’t prevent the outcomes of the past two points, but that doesn’t mean you shouldn’t protect your computer. Without these programs, anyone could take over your phone and make calls or hack into your personal computer and download your pictures. If you buy a brand new 65″ Plasma TV, you’d get the protection plan. Even if it’s free, you’d still want it. So protect your computer!

Theodore J. Salvo

Working on tech no one else wants…

Author: Theo