Figuring out Windows FirewallI ran into an issue last week when I was working on upgrading a SQL2014 Availability Group to SQL2019. Nothing went wrong with the actual SQL upgrade, however, post-upgrade I ran into some headaches. Turns out Windows Firewall can bite you in the ass on an upgrade. The reason for the upgrade in the first place was so we could take advantage of automatic database seeding.

Automatic seeding is to be the lynchpin on PowerShell script I’m working on and will post about later. The idea will be to check for new databases added during the day, and then programmatically added them to the secondary later in the day. But that post is for another day.

After the upgrade, I performed a failover test, verified everything was working okay. After failing back over to the original primary, I added a database to the availability group via the wizard and chose automatic seeding, and while that was running I went off to do something else. When I got back, I saw the wizard had completed successfully and the database had been added. Just to make sure though, I checked the secondary and found no database. Say what?

After spending a day looking for failover cluster errors, testing failovers, adding databases to both nodes of the AG to test database creation, and assorted other tasks, I was still no closer to figuring out why the secondaries wouldn’t sync up or why automatic seeding wasn’t working. Keep in mind, all of this worked just fine when the AG was still running as a SQL 2014 environment. I mentioned this to one of our IT folks because I was at loss.

The fix

One of our IT admins found the problem, although it did take them a couple of hours. Turns out, port 5022 was blocked on the secondary replica but why was it blocked when it worked before? I’ll tell you why. When, and this very well could have been me when I built the server originally, the server was built and everything was loaded, a rule was created in the Windows Firewall for inbound connections. Only, instead of creating the rule to use the port number, the rule was created to allow access to the SSMS executable. Do you see what happened?

When I upgraded SQL to 2019, the 2019 executable took over running SQL. The path changed from <disk>:\<install location>\MSSQL12.MSSQLSERVER\blah blah blah to <disk>:\<install location>\MSSQL15.MSSQLSERVER\blah blah blah

The fix was simple, we changed the firewall to allow connections on the AG port. I mean I guess we could have repointed the rule, but then if we ever upgraded that server again…I promise we would forget about this issue and had to repeat the troubleshooting work figuring it out again.

Be humble folks, every so often we manage to shoot ourselves in the foot.

One of the hardest things to do when looking to start out in the SQL world is to find quality sources to learn from, or just not get overwhelmed with information. To help alleviate that, I have assembled a list of resources that I use to continue learning and research problems I might encounter.

What follows is a list of places to go. I also will periodically update this list as I find new sources, or old sources drop off.

Web sites

SQL Server Central  – Has Stairways, articles, and other various resources to learn Microsoft SQL

PASS – Microsoft Professional Association of SQL Server professionals – free to join and do monthly training webinars that are free. https://www.pass.org/

SQL Saturday – Local SQL events that provide free SQL training one Saturday a year. https://sqlsaturday.com/ – It’s also a good networking event.

Twitter – Microsoft SQL pros are very active on twitter. Check out the #SQLHelp to get a feel for problems. It’s also a good way to meet other SQL folks.

SQL slack channel – How to join the SQL slack channel – Oh and another good way to meet SQL folks.

TSQL Tuesdays is a monthly Blog party based on topics of interest to SQL professionals. – http://tsqltuesday.com/

Bloggers

There are a number of bloggers to follow. Most of the resources are free, but there are a few that have paid materials

Blog

To help build your IT credentials it’s helpful to start a professional blog to document what you learn, showcase your work, and help others with similar situations. You don’t have to pay someone to do or invest a lot of money into it. You can start off with free blog website services to get your feet wet.

Podcasts

Podcasts are a good place to learn about the SQL profession. Although sometimes it can be overwhelming and tough to follow, it still helps to listen. Sometimes the topics will trigger “aha” moments that lead to a small epiphany.

Learning about proper development isn’t such a bad idea either. You can pick up a lot of information about SQL and what is needed by the devs from the SQL perspective.

Soft skills are important as well. I recommend checking out the following podcasts to help in that arena.

Other areas to explore

Other areas to explore would be to pick up Powershell and Python.

Log File

A couple of weeks ago I was given the task of helping a developer figure out why his Postgresql cluster wouldn’t fail over. The specific problem was to test failover prior to launching the Postgresql cluster in to Production. Sounds easy enough,, right? Setting things up wasn’t too bad, but finding answers quick was a challenge.

Setting up

The last thing I wanted to do was to start potentially destructive testing on a cluster that was to be used in Production. Instead I opted to create a couple Postgresql instances that I could set replication up on. Without thinking much, I set up instances on my laptop and ran them under Windows. I got to thinking about that, and realized that wasn’t going to reproduce the same results as on existing Linux cluster. So I spun up a couple of Ubuntu VMs.

Once getting the VMs spun up on my laptop, I had to install Postgresql, which wasn’t that difficult. In fact, it was a pretty simple process of making sure Ubuntu was updated, installing Postgresql per the documentation, and then configuring the HBA file to allow me to connect. The challenge came in setting up Replication for Postgresql. However, once setup it was pretty simple to get replication configured.

Getting the fail over to work

The way Postgresql handles replication is counter intuitive to the way SQL Server is done. Of course, to be fair I’m sure a Postgresql DBA would think the same about SQL. The Master-Slave relationship is just that,  Master telling the Slave what to do. If the Master goes down, the Slave is available for read. If the Slave goes down, the Master keeps going. In this relationship, the Master is the only write node. So when you promote the Slave to Master, the previous Master is orphaned. To rejoin the cluster the previous Master must be reconfigured as a Slave. Good times there. No failing over in a back and forth fashion.

My issue was with the promotion step kept failing. I would get this jacked error that:

pg_ctl: server did not promote in time

You have to love Linux for it’s vagueness. Turns out, it was pretty easy fix. When I was following the setup guide, I had copied in a recovery.conf file. Only, I didn’t modify the owner of the file. So had I taken more than 2 seconds to remind myself to check the log file, I would have discovered this problem much sooner. As it turns out, I made a post about this on DBA Stackexchange and was reminded to check the log file. In my defense, my Linux is a little rusty but that doesn’t excuse the fact I know to always check the log.

Turns out, changing the owner of the recovery.conf file made the promotion work flawlessly. I should know better. Always check the log file.

This week I got asked to help a developer figure out why a failover wasn’t working on a Postgres cluster. Interesting enough I guess. Especially because I don’t know anything about Postgres. Good time to learn I guess? I don’t know. Anyway, I accepted the directive and started trying to get familiar with the new RDBMS system. The dev sent me a username and password to work with, so I got to work trying to figure out the issue.

The Work

Starting off, was a failure in communications. I was sent a username and password, and naturally assumed this was as postgres user. So I loaded up PgAdmin and tried to go to work. Ya, no that wasn’t happening. It turns out, the username and password given to me was for a Linux account. A Linux account without the ability to sudo to root (or anyone else). So after going in circles for the morning, I finally got my access straightened out.

Moving on, I setup a user for me in Postgres, modified the necessary conf files to give myself to login into Postgres with PgAdmin from my laptop. Sadly, between the Googling and continued back and forth over my access, this took the better part of the day. Once in though, it didn’t take me to long to figure out the basic lay of the land. I guess when you get down to it, an RDBMS is a RDBMS is an RDBMS.

Failover

This is where the process takes a turn. In SQL, when you think of failover replication with Availability Groups you know you can bounce the nodes pretty easily. You can make any member node of the AG the primary replica without much work. Makes sense. No so in Postgres/

In Postgres native replication, to failover you have to shut the primary node down and then manually (or through a trigger file….*still not sure about this) promote the standby server to master. Then, when you bring the previous master back online, it has to be brought up and configured as a standby server. At least, that is my understanding of the process so far. It seems a little jerky to me to have to jump through these kinds of hoops to have a failover configured. Then again, there is probably some framework or script that handles failovers better. I’m still working on understanding if this is actually how Postgres handles the failover.

Still learning….

#tsql2sdayThe 101st addition to the SQL monthly blog party is being hosted by Jens Vestergaard (b/t). For the topic this month, Jens has asked which tools are essential to our daily job. Or rather, which tools do I depend on as a DBA.

Interesting Topic

I would love to say I noticed the invitation last week and had an idea already in place. I would like to say that, but  I would be lying. In a moment of serendipity,.I’m testing my individual DR solution today How effectively can I work from a borrowed workstation. You’re probably sitting there thinking, “Wow, someone out there actually tests how to work from a borrowed machine? That’s hard core!”. While, I would suggest asking yourself what would you do if your main workstation become unavailable while you’re on call and have a plan in place, this isn’t that. I simply was to preoccupied this morning to realize I had forgotten my laptop bag.

So today, I’m thinking about how much I miss my work environment Things like:

  • NotePad++ that I use as a text editor and scratch paper.
  • My local script repo
  • Reference docs/ebooks I keep on the hard drive
  • SQL prompt
  • SQL Search

I also miss how I have various apps configured for my personal preference. But I’m also thankful that I have cloud tools like OneNote and LastPass that make this more bearable. Signing into Mozilla got me access to my bookmarks.

What this is teaching me

I am quickly realizing having my documentation/ebooks and scripts available in some kind of repository would be greatly beneficial (Why yes, I am embarrassed I haven’t put those scripts in a repository). It is also making me realize how dependent I am on a laptop that could fail or be stolen, and just how disruptive it is. So I will be working on a plan to resolve that.

I am also going to make sure I place my laptop bag in a more obvious spot before I walk out the door.