With the invitation from AndyLeonard for the May month of T-SQL Tuesday, here are my thoughts on how to respond to technology changes.
As there is a saying, “When there is no Change, there is no Growth”
This is true in every aspect of life including technology. As the Technology changes, it is helping our lives to be more simpler and easier than before. With the continuous advancements in the technology, many professionals find themselves rushing in learning the new technologies. These advancements in the technologies adding up new features to previously existed technologies improve in performance and solve problems making our lives easier. The only question is how quickly can we cope with this change?
The only answer I believe is to train the employees with the new changes in the technologies. It is really important for the companies to not only bring the latest technologies to the companies for the use but also to mainly focus on the employee trainings to use the latest technologies efficiently.
Apart from the trainings, there are many employees who love and always look for the ways to learn new technologies. For every new technologies we have now, a free training platform may already be available to train the individuals so they have ease in using those technologies in their work. For example, Microsoft Azure. We do have a free training from Microsoft known as Microsoft learn. Microsoft also offers a free azure account so professionals can go ahead and get that hands on experience as they learn. This platform has helped many professionals across the world including me and helping in advancing our careers. Huge thanks to Microsoft for taking this initiative.
To keep ourselves updated with the latest evolving technologies, I follow several resources out there along with the technical documentation itself. There are several blogs related to specific technologies, communities helping each other to learn these new technologies and several technical newsletters and publications to keep us updated. I follow social media articles and posts from linkedin and twitter to be updated as much as possible with upcoming technologies.
Along with these, there are several youtube channels providing the concepts of the new technologies as they evolve. There are other training companies who can actually provide trainings especially from beginner to advance level along with hands on labs.
Keeping us up to date to know what is going on is not a tough job but investing time and interest in learning these technologies to effectively use in our daily work life is really important.
At the beginning, everything seems to be so confusing. If we see the number of technologies evolving at the same time, it is overwhelming for sure but if we can take one technology at a time and if we can focus on how the new technologies can actually simplify our daily work, we will be embracing the new technologies for the latest innovations they can offer.
Finally, it is very important to be ready for more and more advancements and to anticipate the changes. As the technology evolves, it becomes much easier to use. Keeping ourselves updated is the only way to keep growing!
With the invitation of Steve Jones for April month of T-SQL Tuesday, I am going to share some of my thoughts on using Jupyter notebooks.
Jupyter Notebook is an open-source web application that can serve us to create, share code and documents with others. It provides that beautiful way to document your code, text, images and query results. Jupyter notebooks in Azure data studio can run PySpark, Python 3, Spark R and Scala, as well as PowerShell. Run and visualize the data as you see the results without having to leave the environment. This is an excellent tool for documenting things. You do not have to write your long emails attaching bunch of screenshots any more. Once you start using the Jupyter notebooks, it will save you so much time and give you peace of mind. This tool can not only be used for documentation purposes but also can be used for your presentations for your demo purposes. Not only for the SQL code, this is a very nice tool for doing machine learning models, visualizing data and much more. Your code can be written in different cells which can be executed individually. This can help us to run specific part of the code in the project. This tool supports Python, R as well.
There are some magic commands that you can use to make your work easier. %clear, %autosave, %debug and %mkdir are some of the commands and these magic commands can be used in two ways. Linewise and cellwise. Linewise is when you wish to run a single command and cell-wise is when you wish to run an entire block of code.
There are also interactive dashboards in Jupyter notebooks.
When you wish to share these jupyter files and as they are in json format. You can save the notebook using different file options like a CSV, Excel, JSON or XML file . You can also convert your notebook into different formats by using the nbconvert option. For analytical queries, you can also build a chart for the data.
You can use the Jupyter notebooks for giving presentations to your customers and clients about your projects.
Jupyter notebooks is an amazing tool as you can document everything what you do. Once we start using Jupyter notebooks, you would definitely do not want to turn back to the previously used tools anymore. I recommend this tool to anyone who would like to explore and take advantage of many features the Jupyter notebooks has to offer.
With the Invitation of Brent Ozar for this month T-SQL Tuesday, I would like to share some thoughts on least favorite data type in SQL Server. Thank you Brent for hosting March month of T-SQL Tuesday.
Though VARCHAR(MAX) is suitable in situations with large strings of data, it has its own complications that we need to consider. In my career as a DBA, I at least saw couple of times SQL developers using VARCHAR(MAX) when they should not. Fixing the datatypes once in production is painful and causes risks.
When the data is stored in VARCHAR(n) datatype column, these values get stored in standard data page. VARCHAR(MAX) uses IN_ROW_Data up to 8000 bytes of data but if the data in more than 8000 bytes, it uses LOB_Data page and a pointer (reference) is stored in the IN_ROW_Data_page. We cannot create an Index on the VARCHAR(MAX) data type as a key column in the index. You can add the column of VARCHAR(MAX) as an included column but that wouldn’t be seekable and will duplicate the column completely which is lot of storage. We cannot compress the LOB data and the data retrieval from the LOB data is relatively slow. If the LOB data is stored as IN-ROW, it can be compressed but only when we compress the table with the PAGE level compression. LOB data cannot be compressed if you use ROW level compression and doesn’t depend if the LOB data is stored as in row or out of row.
Any variable you declare in VARCHAR(MAX) as the datatype will automatically spills to Tempdb which can cause performance problems. If you are using the VARCHAR(MAX) datatype column in your queries and sorting them, that requires so much memory to sort this data which may cause the memory issues. For selecting data having this datatype more than 8000 bytes can cause several off page I/O’s.
If you are running queries and searching on column, that column should always be used as a fixed length column instead of VARCHAR(MAX) data type. If your row size can be managed to be less than 8000 bytes, that will store the data IN_ROW_Data_page which can avoid any page splits and avoids locking or latching during the updates. Limiting the length of the strings is really important for designing any database. Using VARCHAR(n) is recommended over VARCHAR(MAX) when ever possible.
I am really looking forward to what other SQL family members have to say about their least and the most favorite data types!
With the Invitation of Mikey Bronowski for this month T-SQL Tuesday Tools of the Trade,I would like share some useful tools that make my job easier:
OBS Studio: This is a free and open source software for video recording and live streaming. I mostly prerecord my sessions using OBS. I personally love this tool as we have pretty much good content on YouTube that teach us how to use this tool.
SentryOne Plan Explorer: Plan explorer is an amazing tool to analyze your execution plan and tune your queries very quickly. Its completely free.
SQL Search: It is a free search for databases. If you need to find any specific object/column or any string on any database on the server, this tool is very useful. I use SQL search regularly.
SQL Compare: SQL compare tool is very useful to compare the database schema and data between environments. I use this tool regularly.
SQLQueryStress: It is a lightweight performance testing tool, designed to load test individual queries. I personally use this tool for my demos and testing any queries for performance.
Zoomit: ZoomIt is a free screen zoom and annotation tool for presentations. I use it regulary and I am sure most of the speakers already know about the tool.
Diffchecker: A very useful too to compare code/documents.
Adobe Lightroom: This is a photo editing tool which I personally liked. It is professional and used by many photographers across the world.
Free Download Manager: I use this tool for my downloads. It is much faster to download files. I use this tool especially for YouTube downloads. If you need to download the playlists at a time in a single shot, this tool is really useful. It is completely free.
Nimbus Screenshot: It is a chrome extension. This tool is really useful when you wanted to take the screenshot of entire page. You can select the area you wanted to take the screenshot by scrolling the page. Very useful.
These are some of the useful tools I use regularly. I would also like to see the list of the tools others feel useful. I would love to start using those tools as well.
It is a challenging time for each one of us during this pandemic. It is very important to be mentally strong during these hard times. Below are some of the tips I followed to take a break when we are stuck in home
Reading moral stories to kids. Kids really enjoy me reading for them. As I explain the moral values to them, I learn along with them. This will instantly calm me down and make me think the wider perspective of life.
I love arts and crafts. I love shopping for the material to make any new crafts from the ideas I have since long time. Putting something in action and seeing the craft coming out the way you saw in your mind gives so much satisfaction. Below picture is one of the crafts I made “3D wall Garden”.
3. I listen to David Goggins (Former navy seal) podcasts. He explains how a person can be mentally strong and how much a human being can accomplish just by changing the way we look at life. I started reading his book “Can’t Hurt Me”. This book has already changed me. I learned how to be mentally strong during the hardest times of life.
4. Watching thriller movies. I love watching thriller movies as they take me to a complete different world for few hours. Its a true refreshment.
5. Spending some time myself with a cup of coffee. No distractions with any social media, mobile phones or TV. Solely, its a self time. This is the best time to analyze how our mind wanders around. Controlling mind chattering is very important for mental health. Watching the thoughts pass through the mind freely and with mindfulness is like a meditation.
These are some of the tips I follow to take a break, I would love to see the tips from other participants for this month T-SQL Tuesday!
Thank you Lisa Griffin for the invitation to December month of T-SQL Tuesday.
After I started presenting, I learned many technical concepts but here I would like to quickly mention couple of them:
Memory grant connections are stored in a memory component in the SQL Server which is called as Fixed memory. As the fixed memory cannot be expanded or shrinked when there is a memory pressure, SQL server requests the other components of memory like for example, evict the plans out of the plan cache and dirty pages out of the buffer pool to release the memory as memory grants for executing the queries. These memory grants are allocated before the query goes to the execution phase. Memory grants for any query are allocated based on the hashes, sorts in the execution plan and Cardinality estimates. When there are many queries requesting more memory and when the Fixed memory is already full, these other memory components releases the memory as there is an internal pressure with in the SQL Server. Once the memory grant is allocated to the query, the memory will only be released once the query execution completes. Resource Semaphore is responsible for satisfying memory grant requests while keeping overall memory grant usages within the server limit.
Redundant joins between the tables will be only removed by the SQL Server automatically on the background only if the referential integrity is maintained by the foreign key constraint defined between both the tables. If there is no Constraint defined, then as SQL Server doesn’t know about the relationship between two tables, SQL Server will perform a join scanning one table to find matching values from other table.
SQL Server needs a NOEXPAND hint to create a query plan that uses an indexed view. When NOEXPAND is specified, automatic stats are created on index view same as it happens with the ordinary tables.
I would also like to hear from other community members of what else they have learned from presenting!
With the Taiob Ali invitation for November month of T-SQL Tuesday, I would like to share how I am managing and coping with the pandemic times. Thank you Taiob for the invitation on this topic.
There is so much stress in everyone’s life right now. There is so much going on this year that we need to keep ourselves busy hoping for the good days to come. It is a challenging time for everyone of us.
Since the pandemic started, every home turned in to a school, day care, play area and office for all of us. Thanks to all the wonderful kids around the world coping with their parents during these situations. Its really hard for kids to stay away from their outdoor activities for so many months.
It is mentally challenging during these times to mainly focus. As a mother of two kids (4 and 7 year old), I know how hard it is to all parents around the world to work and take care of the kids at the same time. It’s really tough to have our own self time anymore. I took these hard times as really challenging. I followed some of the tips to keep myself positive:
I regularly listen to motivational speeches on youtube to keep myself motivated.
Planning for the weekly goals and trying to accomplish them.
Helping kids on their homework which gives me enough time to spend with them.
Reading moral stories to kids as I feel this is so important to learn moral values. I am learning while I read to my kids. This helps me to see positivity in negative situations.
Gardening gives me peace as I spend some time with nature.
No matter how bad the situation might seem, I am believing that something better is going to come. Hope is all we need right now.
On the positive side, the best thing that happened to me during these times is I had the opportunity to start my career as a speaker. I was able to participate in several conferences starting with New stars of Data, SQL Saturdays, Dataweekender, DBCC International conference, GroupBy conference, Drupaljam reboot 2020, Google Dev Fest Tanzania 2020, and going to present for Granite state code camp, Women in Technology Summit 2021 and many other local user groups across united states and other countries like Canada.
This is only possible because of many wonderful people from SQL Family. I am very thankful to everyone for helping me during my journey as a speaker.
There is so much positivity in SQL Family and I feel very happy to be part of the community.
I wish and pray for all the families who have lost their loved ones due to COVID. May God heal you and give you peace in your lives. I am praying for all of you.
I hope there will be vaccine soon and I wish to see our lives going back to normal days as before.
One last suggestion, please wear a mask, do not go out in public until it is really necessary, spend as much time as possible with your loved ones as they are in home with you all the time during this pandemic. Try to listen to motivation and do meditation/deep breath exercises when ever you feel down. Never ever forget you are impacting someone lives with your positivity around and you might not even realize that you are creating an impact on others. You, me and we all are valuable and loved by God. We all are in this together and we will pass this pandemic together.
With Rob Volk Invitation to October month of T-SQL Tuesday for the Data Analogies, I tried thinking of an analogy for SQL Server Statistics.
Here is my analogy:
For example, One of our friends requested us to volunteer for organizing their kid first birthday party. You and me, we both are more than happy to help. We are so excited to celebrate it with all our friends inviting as many people as possible.
We gathered invitee list together making thousand people in total including adults and children. We planned to create a RSVP list with some beautiful invitation link requesting all the invitees to submit the RSVP with head count number for girl kids and boy kids along with the head count number of families without kids. We sent the invite two months before the event date expecting everyone would submit the RSVP. Based up on the number of head count (girls, boys and families without kids), we want to purchase the return gifts.
Girls get Barbie dolls, boys get Car toy and families without kids get a nice home decor set.
We waited for the RSVP submissions and gathered the results five days before the event date so we can shop for the return gifts. Out of thousand invitees, we received only three hundred RSVP’s. We believed that is the only head count we will see on our event date and we just bought return gifts and food items for three hundred people.
On our surprise, we see thousand people on the event date at the party since the party is on a weekend. We now understood that many people were actually willing to come to the party but for some reason, they did not had chance to do the RSVP. Now, we are all tensed up not knowing what to do.
This is not at all what we expect would happen. We expected something and what actually happening at this event is something else. Everything got messed up now. There is stress everywhere more than fun in the party. We immediately requested some of our friends to order for more dishes, asked some other friends to grab some random gifts from near by store. Many invitees were waiting for their turn to get the food on their plate. People frustrations were clearly seen on their faces. One situation carried to next situation in the event making the whole event a disaster. We anyway passed the event but with no satisfaction. We should have had good estimates of the head count to have the good party plan.
Instead, if we have got the good percent of RSVP’s like around 80% of the total thousand invitees, we would have been in the better shape with the plan. 80% of thousand is 800 people. This would have given us the better estimate close to the actual estimate of the head count. What if 100% of the invitees submitted the RSVP, that would be wonderful, right? Estimation would have been exact match with our actual head count in the party. WOW, that is so good to be true. There is no one single party I have seen where all invitees submitted their RSVP’s. There are always people missing RSVP’s. It is always still good to have the hundred percent estimate though. From invitees point of view, it takes time and effort to make sure they all update the RSVP’s to get the exact estimate.
For making this event successful, we should have been a little cautious while planning for getting these RSVP’s updated by the invitees. We would have send them the notification for every few days or once every week depending up on the number of invitees list. The more invitees we add to the event and the more updates we do to the list, the more frequent notification emails for our RSVP’s should be.
Now we know what we have missed. As we always learn from our past mistakes, we now learn we would have had send the notifications regularly so we get the RSVP’s updated and so we get close estimation which ultimately produces a good party plan.
What if the same concept applies to the SQL Server statistics?
SQL Server optimizer always get the estimate from statistics for how many number of rows (Invitee column statistics) it might expect to get. How does it do that? It will gather and trust the estimates being correct from statistics. Invitees is the column statistics here and RSVP’s are our statistics updates. Our notifications to update the RSVP’s is same as updating our Invitees column statistics using sql agent update stats job or through AUTO_UPDATE_STATS option enabled or through Ola hellengren scripts.
From the estimates of our RSVP’s, we ordered food and gifts for only 300 people out of 1000 people. In the same way, SQL optimizer will allocate less memory based up on the estimates. Because of the less number of the gifts we bought, we requested other friends to go and purchase gifts immediately from the near by gift store. This caused stress on our friends and they did not find all the gifts they need anyway. This made the whole party plan go bad. In the same way, due to the bad statistics estimation, Sql optimizer allocates less memory to process the query which is not sufficient and so it will finally have to spill to the disk. One bad column statistics estimate carries the stress to other operators in the query plan effecting the other operators in the plan and eventually effect complete execution plan.
To solve this issue, Sql Server invitees column stats should be updated regularly.
Depending up on the number of inserts and updates happen to this invitees column in the table, we have to plan for updating these statistics as well. As it takes energy and time for most of the invitees to go and update the RSVP’s, it takes resources and time for updating the invitees column statistics as well.
In order to get good estimates, RSVP’s should be updated as much as possible. If there are not many invitees that are being updated frequently, decent percentage of RSVP’s should be always good for good enough party plan. In the same way, decent amount of percentage stats update should be good enough to generate good plan. If updates happens regularly to the invitees column, frequent statistics updates is recommended.
It is always best to set the AUTO_UPDATE_STATISTICS option enabled. With this option enabled, Sql Server will go ahead and update these statistics automatically in the background.
For the versions SQL Server 2014 and below, the auto update stats kicks in when the threshold hits when 20% of the records gets updated+500 record change but for the later versions SQL Server 2016 and above, auto update stats kicks in when the threshold hits sqrt(1000 * total number of rows) which is way small than the older threshold.
Bottom line from this post is,
The better the estimates, the better the party plan will be 🙂
With Elizabeth Noble Invitation, this will be my first post for T-SQL Tuesday. I am very excited while writing this blog post. Thank you so much Elizabeth for sending the invite.
With the current COVID situations, I believe there would be no one on this planet without stress. I wish we had a fast forward stress relief button which can automatically remove the stress from our lives.
With new challenges we face every day comes with new opportunities. While we travel the journey making these challenges into the opportunities, we will have chances to take some smart decisions and actions making our lives easier. Automation is one of them.
As new technologies emerge, automation becoming the key. The tasks which take longer time before can be completed in just few minutes by using automation.
In my everyday job, there are many tasks that I have to do repeatedly across many servers. These tasks can be very simple but time consuming when you are dealing with some hundreds of servers. Automating these tasks can be very helpful. Below are some of the simple tasks which I automated and saved lot of time:
We have several databases that are being replicated between different SQL server versions. This includes SQL server 2008, all the way up to SQL server 2016 versions. There are some databases still on SQL 2000 compatibility level on SQL 2008 servers. Sometimes, replicating these databases to other versions of SQL servers is a challenging task. When ever we had to take a fresh snapshot and replicate, distribution job fails with errors related to default setting options(example, ANSI_PADDING set to OFF when the option should be ON) on the .sch files the snapshot generates (.sch file contains the script to create the table and the replication specific stored procedures on the subscriber). It is a very tedious task to go find all .sch files having this setting as OFF and manually change the setting to ON. So, I used a powershell script in the snapshot job added as a second step after the first snapshot creation step that changes the ANSI_PADDING setting from OFF to ON on all .sch files. When we had many databases having the same issue, automating this task saved me from so much stress.
Once, I had to change the owner of the SQL agent jobs to sysadmin account on around 300 servers at a time. This is a stressful task if I had to do it manually one by one server. This will take me at least few days to complete for all 300 servers. I used powershell script by using commands to get the server list from the text file so the same SQL query runs on each SQL server mentioned in the list. This took me few minutes of time to complete for 300 servers. Imagine how much time this process would have saved me.
There are other instances where I use powershell automation. If I had to give permissions for logins on multiple servers at a time, automating this task using powershell really helps.
I had to convert around 200 servers between SQL server editions with out losing data and the databases. I used powershell script to do this task. If I had to do this manually by uninstalling and installing different edition takes me so much time and effort.
In my work environment, there are many non-production servers. We do make sure all these databases are in simple recovery mode. Developers do create and drop databases on non-prod environments for their testing purposes with full recovery mode. I created a SQL agent job which will check the database recovery modes and send me a report of database names which are not in simple recovery mode. This job runs every day at the scheduled time.
I use automation for documentation purposes too. For example, gathering all SQL agent jobs, gathering login permissions etc; from multiple servers using SQL agent jobs.
These are some of the examples of how I use automation in my every day work life. I would like to know how automation is helping others in their day to day work life.